Abstract

In many real world networks agents are initially unsure of each other’s qualities and must learn 
about each other over time via repeated interactions. This paper is the first to provide a methodology 
for studying the dynamics of such networks, taking into account that agents differ from each other, 
that they begin with incomplete information, and that they must learn through past experiences which 
connections/links to form and which to break. The network dynamics in our model vary drastically from 
the dynamics in models of complete information. With incomplete information and learning, agents who 
provide high benefits will develop high reputations and remain in the network, while agents who provide 
low benefits will drop in reputation and become ostracized. We show, among many other things, that 
the information to which agents have access and the speed at which they learn and act can have a 
tremendous impact on the resulting network dynamics. Using our model, we can also compute the 
ex ante social welfare given an arbitrary initial network, which allows us to characterize the socially 
optimal network structures for different sets of agents. Importantly, we show through examples that the 
optimal network structure depends sharply on both the initial beliefs of the agents and the rate 
of learning by the agents. Due to the potential negative consequences of ostracism, it may be necessary 
to place agents with lower initial reputations at less central positions within the network. 

I. Introduction 

Networks are pervasive in all areas of society, ranging from financial networks to organizational 
networks to social networks. And an important feature of many real world networks is that agents 
do not fully know the characteristics of others initially and must learn about them over time. 
For instance a bank learns about the credit-worthiness of a new borrower, a worker in a firm 
learns about the ability of a coworker, and a buyer learns about the product quality of a supplier. 

Document Date: June 2016 

The authors are indebted to Jie Xu, Peng Yuan Lai, William Zame, and Moritz Meyer-ter-Vehn for valuable assistance. This 
paper also benefited from discussions with Matt Jackson and seminar participants at the 2015 NEGT Conference and the 2015 
SWET conference. The authors gratefully acknowledge financial support from the ONR. 

Simpson Zhang: simpsonzhang@ucla.edu, Mihaela van der Schaar: mihaela@ee.ucla.edu 





Such learning can strongly affect the resulting shape of the network. As agents receive new 
information, they can revise their beliefs about other agents, update their linking decisions, and 
cause the network to evolve as a result. To properly analyze such network evolution, it is crucial 
to understand the exact mechanism by which learning impacts network dynamics. 

The impact of agent learning on network evolution has not been well studied in the existing 
literature. A large network science literature analyzes the effect of learning on fixed networks 
that have already formed (see Scott (2012)). A smaller microeconomics literature^1 studies the 
formation of networks but makes very strong assumptions (e.g., homogeneous agents/entities, 
complete information about other agents). Neither the network science literature nor the 
microeconomics literature has so far taken into account that agents behave strategically in deciding what 
links to form/maintain/break and that they also begin with incomplete information about others, 
so they must learn about others through their interactions. As a result, neither network science 
nor microeconomics provides a complete framework for understanding, predicting and guiding 
the formation (and evolution) of real networks and the consequences of network formation. 

The overarching goal of this research paper is to develop such a framework. An essential 
part of the research agenda is driven by the understanding that individuals within a network 
are heterogeneous - some workers are more productive than others, some friends are more 
helpful than others, and some borrowers are more creditworthy than others. Furthermore, these 
characteristics are not known in advance but must be learned over time via repeated interactions. 
The rate of learning itself may also be strongly influenced by the network structure: agents 
engaged in more interactions are likely to reveal more information about themselves. 

As a motivating example, consider a group of financial institutions that are linked together in 
a financial network.^2 These financial institutions provide benefits to each other by engaging in 
mutually beneficial trading opportunities, such as providing each other with liquidity or investing 
in joint ventures.^3 High quality institutions are likely to develop a high realized quality of 
assets from these joint interactions, while low quality institutions are likely to develop a low 
realized quality of assets. Each institution only continues to link with another institution (over 
time) if the counterparty is believed to be of sufficiently high quality. As time progresses, the 

^1 See the overview in Jackson (2010) for instance. 

^2 Our model can also be applied to a wide range of other networks, such as organizational networks, social networks, or 
expertise networks. We discuss some implications for these settings as well throughout the paper. 

^3 As in the model of Erol (2015). 
institutions observe the actions of their counterparties, update their beliefs about the quality 
of each counterparty, and change their linking decisions as a result. In this way, learning 
by the financial institutions causes the network topology to evolve over time. The network 
topology also impacts the rate of learning, as an institution can learn both through its own 
interactions with counterparties and by monitoring the interactions of a counterparty with its 
other counterparties. Since institutions with more connections interact with more counterparties, 
such institutions will reveal more information about themselves to their neighbors over time. As 
a result, while having more connections opens an institution up to more beneficial opportunities, 
it also carries the risk of causing the institution to be shut out of the financial network more 
quickly if it starts losing asset value, as in the case of Lehman Brothers due to its exposures to 
the subprime mortgage market during the 2008 financial crisis. 

Our model takes into account the features of the previous example: agents behave strategically, 
begin with incomplete information about each other, and must learn through continued interactions 
which connections to form and maintain and which to break. We consider a continuous 
time model with a group of agents who are linked according to a network and who send noisy 
flow benefits to their neighbors. The benefits that agents provide could be interpreted for instance 
as the benefits that financial institutions derive from providing liquidity to each other or from 
diversifying risk with each other’s specialized assets. Each agent is distinguished by a fixed 
quality level which determines the average value of the flow benefits it produces. Agents observe 
all the benefits that their neighbors produce, and they update their beliefs about a neighbor’s 
quality via Bayes' rule. Neighbors with more connections will reveal more information about 
themselves over time. Agents will maintain links with neighbors that provide high benefits, but 
will cut off links with neighbors that provide low benefits. The network evolves as agents learn 
about each other and update their beliefs. Since the number of links an agent has affects the rate 
of learning about that agent, the rate of learning about an agent changes as the network changes, 
leading to a co-evolution of network topology and information production. 

Our model is highly tractable and allows us to completely characterize network dynamics and 
give explicit probabilities that the network evolves into various configurations. In addition, we 
are able to characterize the entire set of possible stable networks and analytically compute the 
probability that any single stable network emerges. This allows for predictions regarding which 
types of stable networks are likely to emerge given an initial network. 

We also study the implications that learning has on the social welfare and efficiency of a 
network. Our results show that learning has a beneficial aspect: agents that are of low true 
quality are likely to produce low signals and will eventually be ostracized from the network. 
Learning also has a harmful aspect: even high true quality agents may produce an unlucky string 
of bad signals and so be forced out of the network. Moreover, even having low true quality agents 
leave the network can reduce overall social welfare. A marginally low quality agent may harm 
its neighbors slightly, but it also receives a large benefit if its neighbors are of very high quality. 
Therefore if the low quality agent leaves the network, the overall social welfare would actually 
decrease. The issue here is that agents only care about the benefit their neighbors are providing 
them, but not the benefit they are providing their neighbors. This results in a negative externality 
every time a link is severed.^4 In many situations, the negative effects of learning outweigh the 
positive effects, so on balance learning is actually harmful. In particular, increasing the learning 
rate about marginal agents whose neighbors are high quality agents is bad, because forcing the 
marginal quality agent out of the network sacrifices the social benefit of the link to the high 
quality agent. However, increasing the rate of learning about a marginal quality agent whose 
neighbors are also marginal quality agents is good, because more information will be revealed 
about that marginal quality agent, allowing its neighbors to more quickly sever their links to it. 
The impact of learning can therefore be either positive or negative depending on the specific 
network. 

Our welfare results have important implications for network planning and are useful in a 
diverse range of settings, such as in guiding the formation of networks by the policies of a 
financial regulator, human resources department, online community, etc. Due to the varying 
effects of learning, we show that the optimal network design will be quite different for different 
groups of agents. For instance, when agents all have high initial reputations, the optimal network 
design allows all agents to be connected (so that agents can benefit fully from their repeated 
interactions). On the other hand, if some agents have low initial reputations, then allowing all 
agents to connect is not optimal, and it will be desirable to constrain the network by isolating low 
reputation agents from each other. If such agents did link, they would both send more information 
about themselves through this link, causing themselves to be ostracized more quickly. Each agent, 

^4 The negative effects of ostracism can be particularly acute in financial networks during times of distress in which banks get 
shut out of funding, as in the case of a liquidity freeze. Ostracism has also been demonstrated in a wide variety of social settings 
in the social psychology literature. We discuss this literature and our model’s implications in the Literature Review section. 
as well as the overall network, could then be worse off through the formation of this link due 
to the faster learning caused by the link. In some cases, a star or a core-periphery network 
connectivity structure would generate higher social welfare than a complete network even when 
all agents have initial expected qualities higher than the linking cost. Such a situation arises for 
instance if there are two separate groups of agents, one group with very high reputation and 
the other group with moderate reputation. By placing the high reputation agents in the core and 
the moderate reputation agents in the periphery, the high reputation agents are able to produce 
large benefits for the network, and the potential harm from the moderate reputation agents is 
minimized.^5 

Finally, we consider four extensions of our model that allow for even richer network dynamics 
and learning. In the first extension, we allow the mechanism designer to provide the agents with 
a subsidy that encourages linking.^6 The effect of such a subsidy is to increase the amount of 
experimentation done by the agents, and we show that a sufficiently large subsidy can always 
improve overall social welfare because of this. In the second extension, we allow for agents with 
high enough reputations to form new links with each other, and we show that social welfare will 
be increased when the linking threshold is high enough. In the third extension, we allow new 
agents to enter the network over time, and we consider the optimal time at which new agents 
should arrive. We show that all agents should be allowed to enter the network eventually, but 
delayed entry is desirable in certain networks to protect the reputations of vulnerable incumbent 
agents. Lastly, in the fourth extension we allow for agents that have been ostracized in the past to 
re-enter the network after a set period of time, and we show that the negative effects of learning 
can be mitigated if re-entry occurs frequently enough. 

II. Literature Review 
A. Relation to Theoretical Networks Literature 

Our paper represents a novel contribution to the network formation literature, by being among 
the first to consider incomplete information and learning in networks, as well as by providing a 
tractable model that allows for the computation of many properties, including the ex ante social 


^5 This provides a new reputational reason for the benefits of a core-periphery network, in contrast to other, non-informational, 
reasons that have been proposed in the networks literature. 

^6 For instance, a financial regulator could guarantee transactions within a financial network to make them less risky. 
welfare, of different network topologies. Other papers in the network literature have usually 
studied network dynamics only in settings of complete information when agents perfectly know 
each other’s qualities. For example, the papers by Jackson and Wolinsky (1996), Bala and Goyal 
(2000), Watts (2001), and Galeotti and Goyal (2010) all consider networks where the agents have 
complete information. In these models, agents are aware of the exact qualities of all other agents 
and there is no learning. The network dynamics arise instead from externalities and indirect 
benefits between agents that are not directly linked. When one link is formed or severed, the 
benefits produced by other links change as well, which causes the other agents to sever or 
form their own links in a chain reaction. For some networks, such as communication networks, 
these indirect benefits seem important, as an agent who has many high quality neighbors will 
likely be able to transmit higher quality information as well. However, in other networks such as 
friendship networks these indirect benefits are less relevant and it is the specific quality of each 
individual agent that is the most relevant. This is especially applicable in situations where a new 
group of agents are meeting for the first time, and they learn about each other through mutual 
interactions. We argue that the network dynamics in such situations depend more heavily 
on reputational effects than on changes in the values of indirect benefits. 

We do not assume any indirect benefits in our model and focus instead on the dynamics 
resulting from incomplete information and learning. Agent learning strongly influences the 
network formation process in a way that would not arise with complete information. Agents 
that send good signals will develop high reputations and remain in the network, whereas agents 
that send bad signals will develop low reputations and eventually become ostracized by having 
their neighbors cut off links. The rate of learning about an agent’s quality affects how quickly 
the network evolves and has a strong effect on the resulting social welfare. With complete 
information however, such dynamics would not occur because agents would know each other’s 
qualities perfectly at the outset. For instance, Watts (2001) considers a dynamic network formation 
model where agents form links under complete information. When there are no indirect benefits 
between agents in that paper’s model, each agent would make a one time linking decision with 
every other agent and never update its choice later on. But with learning, agents may change 
their linking choices by breaking off links with neighbors that consistently produce low benefits. 
Incomplete information causes links to fluctuate dynamically over time as new information 
arrives and beliefs are updated, instead of staying static as in the complete information case. We 
propose that such effects are key and even the main driver of dynamics when a group of agents 
are meeting for the first time and forming a network with each other. 

In addition, the tractability of our model allows us to explicitly compute the social welfare for 
different network structures even under incomplete information. This tractability arises from the 
use of continuous time diffusion processes in our model, which allows for closed form equations 
of the probabilities that different networks emerge. In contrast, other network papers such as 
Jackson and Wolinsky (1996) and Bala and Goyal (2000) use discrete time models that do not 
allow for such clean closed form expressions. While these other papers analyze the efficiency 
properties of a given fixed network, our welfare results are much stronger and allow the network 
to evolve endogenously over time as agents learn and update their linking decisions. This enables 
us to compare the ex ante optimality of different initial network structures, as well as provide 
general results for when specific network structures are optimal. For instance, we show that 
when the rate of learning in the network is either very slow or very fast, a complete network is 
optimal if the agents’ initial expected qualities are all higher than the cost of maintaining a link. 
But when learning is at an intermediate rate, it may be optimal to prevent vulnerable agents 
from connecting, even if their initial expected qualities are higher than the linking cost, due to 
the negative externalities associated with reputational effects. Such a result cannot arise under 
complete information, where if agents’ qualities are all perfectly known it would be strictly better 
for all of them to be linked initially. 

This paper is also tied to the literature on observational learning in networks. Works such 
as Golub and Jackson (2010), Acemoglu et al (2011), and Golub and Jackson (2012) analyze 
observational learning in social networks. In these models there is a fixed exogenous network on 
which the agents interact, and the agents learn about an exogenous state of the world through this 
network by observing the actions of neighbors. These papers provide results regarding the speed 
and accuracy of the observational learning that can be achieved by agents connected through 
different types of networks. Our paper is significantly different from this literature because agents 
learn about other agents’ qualities instead of an exogenous state of the world. As such, agents will 
wish to update their linking decisions over time as their beliefs about the agents with whom they 
are connected change. The network and learning co-evolve, causing the network structure 
to evolve endogenously. 

Vega-Redondo (2006) focuses on the issue of moral hazard and monitoring, and it considers the 
diffusion of information about agent actions across a network. It assumes that players engage 
in bilateral prisoner’s dilemma games. Information about player actions diffuses through the 
network, and agents are able to sustain cooperation through punishing defectors. More densely 
connected networks allow for faster information transmission and can therefore sustain higher 
levels of cooperation. The paper analyzes how the structure of the networks that emerge is 
affected by the transmission of information, and it shows through simulations and mean-field 
analysis that the inclusion of network based information can increase network density. Our work 
instead focuses on the issue of adverse selection and on learning about agent types. We show 
that more information can be harmful for welfare because it leads to greater ostracization among 
agents. The tractability of our model also allows us to consider the social welfare generated across 
the entire path of network evolution, as opposed to the welfare of the long run average network. 
We are therefore able to address issues of network design, and we characterize the optimal 
network structure under different environments. We also provide simulations which highlight 
our main results and show the social welfare of different network structures. 

A related networks paper that involves learning with adverse selection is Song and van der 
Schaar (2015). Like us, this paper also considers learning by agents about the types of other 
agents within a network, and it shows how incomplete information and the learning process 
can lead to a wide variety of network structures and dynamics. However, this paper considers 
a discrete time model and incorporates a simplified learning process in which information is 
revealed immediately, after a single interaction. On the contrary, in our model information is 
revealed gradually, and the linking decisions and learning occur simultaneously. Since learning 
takes place gradually instead of instantaneously, we are able to analyze how the precise rate of 
learning affects network dynamics and social welfare. And very importantly, our model allows 
the network structure itself to impact the rate of learning about agents. This assumption is realistic 
as learning is often affected by the number of connections an agent has within the network. We 
show that it has strong implications and necessitates careful planning by a network 
designer to properly control the learning done by agents. In addition, our use of continuous time 
allows our model to be more tractable and able to provide explicit characterizations of the social 
welfare of different network structures. 

B. Relation to Financial Networks Literature 

Our paper is also related to the growing financial networks literature. There have been numerous 
recent papers which seek to explain the prevalent core-periphery structure of financial networks. 
Such core-periphery financial network structures have been well documented empirically 
in a variety of markets, such as for municipal bonds (Li and Schürhoff 2014) and securitization 
(Hollifield et al 2014). The theoretical papers of Chang and Zhang (2015), Farboodi (2015), 
Neklyudov and Sambalaibat (2015), Babus and Hu (2015), and Wang (2016) all propose models 
that seek to explain the prevalence of core-periphery networks. These papers show that features 
such as various forms of dealer heterogeneity can result in core-periphery type structures. 

However most of these papers operate in complete information settings where the types of 
other agents are directly observable. And the papers that do consider incomplete information 
focus instead on learning through investment in information gathering (regarding debt repayment) 
rather than on learning through interactions which are affected by the network structure itself. 
For instance, Babus and Hu (2015) show that since star networks can allow for efficient mutual 
monitoring by financial institutions, they also lead to more efficient trading. Blasques et al (2015) 
show that the benefits that a core-periphery network provides lead to greater stability over time. 

Our paper also provides a justification for the multitude of real world core-periphery networks, 
as we show that such core-periphery networks maximize social welfare in certain networks where 
agents vary in reputation. However, our result is driven by the presence of reputational 
forces, unlike the previous papers. In our model, a core-periphery network lowers the reputational 
risks for vulnerable low-reputation agents, and can thus prevent them from being shut out of the 
financial network as quickly. 

Furthermore the setting of our paper is different from the setting of the other papers. The 
papers that consider complete information are more relevant for longer time frames and stable 
financial market conditions where informational uncertainty about counterparties is low. We view 
our model instead as describing a short time period with great uncertainty. For instance in the 
aftermath of a financial crisis, banks are very unsure of the solvency of other banks due to 
the difficulty of assessing the quality of their assets. In such situations, banks will be hesitant 
to trade with each other and will carefully attempt to learn the solvency of other institutions 
through observations of repayments. Thus each bank’s reputation evolves over time. Banks that 
obtain low reputations may get shut out of the funding market entirely during liquidity runs, as 
was the case during the collapse of Lehman Brothers in the recent financial crisis. It is important 
for a financial regulator to carefully structure the trading network and control the interactions 
so that such situations can be mitigated. 
C. Relation to Social Ostracism Literature 

Finally, we note that our model also has important implications for social and organizational 
networks. Our results about the negative externalities of reputational learning highlight the 
damaging impacts of ostracism found in the social psychology literature. Social ostracism is 
a prevalent force that has been well documented in the social psychology literature in numerous 
settings ranging from online interactions to office workplaces. As Williams and Sommer (1997) 
state, “Social ostracism is a pervasive and ubiquitous phenomenon.” In this literature, ostracism 
can also occur when an agent’s perceived quality drops too low, and will have harmful effects 
on the agent itself. As the paper by Wesselman et al (2013) notes, “Ostracism is a common, 
yet painful social experience...Individuals who do not fit the group’s definition of a contributing 
member may find themselves a likely candidate for punitive ostracism”. That paper shows the 
occurrence of ostracism via an online experiment, where agents differ in their ability to play 
a game. Agents who play badly become ostracized by the others. This effect is similar to our 
model, where agents who are learned to be of low quality are ostracized from the network. 

Ostracism can also occur in workplaces, as some employees may be ostracized by their 
coworkers. Robinson et al (2012) note that “not only are such experiences extremely painful, 
but under some circumstances they can have an even greater negative impact than other harmful 
workplace behaviors such as aggression and harassment.” It is therefore important for companies 
to consider the harmful effects of ostracism that can occur through workplace interactions. 
We provide guidelines for minimizing the negative effects of ostracism through placing lower 
reputation agents in less central positions of the network. 

III. Model 

A. Overview 

We consider an infinite horizon continuous time model with a finite set of agents denoted 
by $V = \{1, 2, \ldots, N\}$. At every moment in time, the agents choose which other agents to link 
with, and a link is established only under mutual consent. These choices are made subject to an 
underlying network constraint $\Omega = \{\omega_{ij}\}$ that specifies the pairs of agents that are able to link 
with each other.^7 For each pair of agents, $\omega_{ij} = 1$ if agents i and j can connect with each other 
and $\omega_{ij} = 0$ otherwise. We call agents i and j neighbors if they can connect. Initially (time 
t = 0), agents are linked according to a network $G^0 = \{g_{ij}^0\} \subseteq \Omega$. As the network will change 
over time, we denote $G^t$ as the network at time t. Moreover, we let $k_i^t = \sum_j g_{ij}^t$ be the number 
of links that agent i has at time t, and we let $K_i^t$ denote the set of neighbors of agent i at time t. 

Agents receive flow payoffs from each link equal to the benefit of that link minus the cost. 
Each agent i must pay a flow cost c for each of its links that is active. Hence, at time t, agent i 
pays a total cost of $k_i^t c$ for all its links. Agents also obtain benefits from their links, depending 
on their linked neighbors’ qualities. However, each agent’s true quality is initially unknown to 
all agents, and we do not require that agents know their own qualities. At the start of the model, 
each agent i’s quality $q_i$ is drawn from a commonly known normal distribution $N(\mu_i, \sigma_i^2)$ with 
$\mu_i > c$. Both the mean and the variance are allowed to vary across agents, and several of our 
results below will utilize this heterogeneity. Agent i generates a different noisy benefit $b_{ij}(t)$ for 
each agent j that is linked to it, and these benefits follow a diffusion $db_{ij}(t) = q_i\,dt + \tau_i^{-1/2}\,dZ_{ij}(t)$, 
where the drift is the true quality $q_i$ and the variance depends on $\tau_i$, an exogenous parameter we 
call the signal precision of agent i.^8 $Z_{ij}(t)$ is a standard zero-drift and unit variance Brownian 
motion, and represents the random fluctuations in the benefits of each interaction. $Z_{ij}(t)$ is 
assumed to be independent over all i and j, and therefore all the benefits produced by agent 
i are conditionally independent given $q_i$. We assume that all the benefits that agent i produces 
are observed by all the neighbors of i, which ensures that agent i’s neighbors all have the same 

^7 This network constraint may arise from the specific interests/desires of the agents regarding who they want to link with, or 
from potential physical/geographical constraints that limit agents from linking. It may also be planned, e.g. through the policies 
of a financial regulator for a network of financial institutions, or by the human resources department in a company for a network 
of employees. 

^8 We can think of the signal precision as representing how much information the agent reveals about itself in each interaction, 
with a higher precision corresponding to more information. It could depend on the type of interaction with the agent (e.g. close 
partnerships or chance encounters), or factors like the agent’s personality. 
beliefs about i at any point in time (information is locally public among agent i’s neighbors).^9 
For each agent i, we define the agent’s benefit history as the history of all previous benefits, 
$\mathcal{H}_i^t = \{b_i^{t'}\}_{t'=0}^{t}$. 
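The benefit process described above can be illustrated with a short simulation. The following is a minimal sketch, not the authors' code: it discretizes the diffusion $db_{ij}(t) = q_i\,dt + \tau_i^{-1/2}\,dZ_{ij}(t)$ with an Euler scheme for a single agent with a fixed number of links. The function name, the step size `dt`, the `horizon`, and the `seed` are assumptions introduced only for illustration.

```python
import math
import random


def simulate_link_benefits(q, tau, n_links, horizon, dt, seed=0):
    """Euler sketch of db_ij(t) = q dt + tau**-0.5 dZ_ij(t) for one agent.

    q        : true quality of the agent (drift of every benefit flow)
    tau      : signal precision of the agent
    n_links  : number of active links, held fixed here for simplicity
    Returns the cumulative benefit delivered on each link by time `horizon`.
    """
    rng = random.Random(seed)
    steps = int(round(horizon / dt))
    noise_scale = math.sqrt(dt / tau)  # std. dev. of one noise increment
    totals = [0.0] * n_links
    for _ in range(steps):
        for j in range(n_links):
            # Each link receives its own independent Brownian increment.
            totals[j] += q * dt + noise_scale * rng.gauss(0.0, 1.0)
    return totals
```

For example, `simulate_link_benefits(q=1.0, tau=4.0, n_links=3, horizon=10.0, dt=0.01)` returns three cumulative benefits that scatter around `q * horizon`, with less scatter when `tau` is larger.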

We assume that agents are myopic, and they thus consider only the current flow benefit when 
making linking decisions.^10 Each agent’s utility is assumed to be linear in the benefits provided 
by each link and the linking cost. This also implies that agents are risk neutral and so consider 
the expectation over neighbor qualities when there is uncertainty. The flow utility of agent i at 
any time t is given by the following equation: 

$$U_i^t = \sum_{j \in K_i^t} \left( E[q_j \mid \mathcal{H}_j^t] - c \right) \qquad (1)$$

B. Reputation and Learning Speed 

Since we have assumed a diffusion process, a sufficient statistic for all the individual link
benefits is the average benefit per link produced by agent i up to time t, which we denote as
$B_i(t)$. Given our above assumptions, $B_i(t)$ follows a diffusion
$dB_i(t) = q_i\,dt + (k_i^t \tau_i)^{-1/2}\,dZ_i(t)$,
where the drift rate is the true quality $q_i$, the instantaneous volatility rate $(k_i^t \tau_i)^{-1/2}$ depends
on the number of links $k_i^t$ agent i has at time t, and $Z_i(t)$ is a standard Brownian motion with
zero drift and unit variance. Importantly, this equation shows that the more links an agent has,
the lower its volatility rate and the faster its true quality is learned. This is because an agent
with more links produces more individual benefits, and so the average over all benefits is more
precise. Note also that an agent with no links would not send any information, and thus there
would be no learning about its quality. Therefore the topology of the network strongly affects
the rate of learning about each agent’s quality.
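As a minimal sketch of the learning-speed claim above, the following Python snippet applies the standard normal-normal update to the diffusion for $B_i(t)$. It assumes a normal prior $N(\mu_i, \sigma_i^2)$ on $q_i$, a link count k held constant over $[0, t]$, and $B_i(0) = 0$. The function name and argument names are illustrative and are not taken from the paper.

```python
def posterior(mu, sigma2, tau, k, t, B):
    """Posterior mean and variance of q_i after observing B_i(t) = B.

    Assumes dB = q dt + (k*tau)^(-1/2) dZ with k constant on [0, t],
    B_i(0) = 0, and a normal prior N(mu, sigma2) on q_i.
    """
    info = k * tau * t                  # precision accumulated from the signals
    prec = 1.0 / sigma2 + info          # posterior precision
    mean = (mu / sigma2 + k * tau * B) / prec
    return mean, 1.0 / prec


# More links shrink the posterior variance faster; zero links leave the prior intact.
for k in (0, 1, 4):
    print(k, posterior(mu=1.0, sigma2=1.0, tau=0.5, k=k, t=2.0, B=2.4))
```

The accumulated precision grows as $k \tau_i t$, so in this sketch doubling the number of links has the same effect on the posterior variance as doubling the observation time.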


[9] This is an important assumption to maintain the tractability of the model. It can be interpreted, for instance, through an
online expertise network where the output of agent i is public, so that all neighbors of agent i can judge the benefit that i has
provided to all its links. Or in an offline setting, we could assume that the neighbors of agent i are continuously discussing the
benefits they have received from i with all other neighbors of i, so that the neighbors maintain the same beliefs. For most of
our results, the information does not need to be fully public; the information regarding agent i need only be available to all
the direct neighbors of agent i.

[10] Such an assumption is common within the networks literature to maintain tractability; see Jackson and Wolinsky (1996)
or Watts (2001) for instance. Myopia is an appropriate assumption in financial networks where firm managers have myopic
incentives. Such myopic incentives have been documented empirically in papers such as Jacobson (1993) and Mizik (2010). We
relax this assumption in the extensions section where we allow for subsidies that change agent linking strategies.

If at time t all links of agent i are severed, then no benefit will be produced by agent i and
this will be denoted as $b_i^t = 0$. In this case no information is added and hence the diffusion of
agent i is stopped at its current level. As mentioned, there is a prior belief about agent i’s quality,
and agents will update this belief in a Bayesian fashion in light of the observations
of flow benefits. These observations combined with the prior quality distribution will result in
a posterior belief distribution of agent i’s quality $f(q_i \mid \mathcal{H}_i^t)$, which is also normally distributed.[11]
We denote $\mu_i^t = E[q_i \mid \mathcal{H}_i^t]$ as the expected quality of agent i given the history $\mathcal{H}_i^t$ and call it the
reputation of agent i at time t. The reputation represents the expected flow benefit of linking
with agent i at time t.

We have assumed that agents are myopic. Therefore, to maximize flow utilities, agent i will
cut off its link with agent j once agent j’s reputation falls below the linking cost c. Since
we assume all agents have homogeneous linking costs, and all neighbors have the same beliefs,
any other agent that is linked to j will also decide to sever its link. From this moment on, agent
j is effectively ostracized from the network; since it no longer has any links it cannot send any
further information that could potentially improve its reputation.[12] While in the base model an
ostracized agent cannot return to the network, we relax this assumption in the extensions section.

IV. Network Dynamics and Stability 

A. Network Dynamics 

The dynamics of the model evolve as follows: all pairs of agents that are neighbors according
to the network constraint $\Omega$ will choose to link at time zero, since we have assumed that all agents
have initial reputations higher than the cost c (any agent with an initial reputation lower than c
is immediately ostracized from the network and would not need to be considered). Therefore the
initial network at time 0 will be the same as the network constraint, $G^0 = \Omega$. Over time, agents
that send bad signals will have their reputations decrease, and once an agent’s reputation hits c
its neighbors will no longer wish to link with it. All its neighbors will sever their links and the
agent is effectively ostracized from the network. We will show that this always happens for an
agent with true quality $q_i < c$, and will still happen with positive probability for an agent with
quality $q_i > c$. The ostracization of an agent will affect its former neighbors as well. Since they
now have one less link each, they will produce information about themselves more slowly than
before, and so their reputations will be updated less quickly.

[11] As mentioned, a sufficient statistic for the entire history is $B_i(t)$, so a neighbor only needs to know $B_i(t)$ in order to
calculate this posterior.

[12] Although ostracism may seem harsh, as we noted earlier ostracism is a prevalent phenomenon that has been widely studied
in the social psychology literature, in settings ranging from online interactions to office workplaces. Furthermore, in financial
networks low reputation institutions may get shut out of funding completely during a liquidity crisis.

The remaining agents in the network will continue to link and send signals until someone
else’s reputation drops too low and that agent is also ostracized. This process will continue
until the qualities of all the remaining agents are known with very high precision and in the
limit their reputations no longer change. Since agent qualities are fixed, by the law of large
numbers any agents that remain in the network will have their qualities learned perfectly in the
limit as $t \to \infty$, and the network will tend towards a limiting structure that we call the stable
network. The next section will explicitly characterize these stable networks, but we note that
many different stable networks could potentially emerge depending on the true qualities of the
agents and the signals they produce. Figure 1 shows the different network dynamics that could
emerge even if the initial reputations of the agents are fixed, due to the uncertainty about the
true qualities of the agents as well as the randomness in the signals they send.
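The dynamics described above can be illustrated with a small discrete-time simulation. This is only a sketch under stated assumptions: it uses an Euler discretization with step dt (so a reputation can overshoot c slightly), normal priors $N(\mu_i, \sigma_i^2)$, and simultaneous updates within each step. The function name, the argument layout, and the default horizon are hypothetical choices and are not part of the paper.

```python
import math
import random


def simulate(adj, mu, sigma, tau, q, c, horizon=20.0, dt=0.01, seed=0):
    """Euler discretization of the reputation dynamics with ostracism.

    adj: symmetric 0/1 matrix of the initial network; q: true qualities.
    Returns (exit_times, surviving_links).
    """
    rng = random.Random(seed)
    n = len(mu)
    active = [True] * n
    rep = list(mu)                            # reputations (posterior means)
    prec = [1.0 / s ** 2 for s in sigma]      # posterior precisions
    exit_time = [math.inf] * n
    for step in range(1, int(horizon / dt) + 1):
        links = [sum(1 for j in range(n) if adj[i][j] and active[i] and active[j])
                 for i in range(n)]
        for i in range(n):
            if links[i] == 0:
                continue                      # no links: no signal, no learning
            rate = links[i] * tau[i]          # information rate k_i * tau_i
            dB = q[i] * dt + math.sqrt(dt / rate) * rng.gauss(0.0, 1.0)
            new_prec = prec[i] + rate * dt
            rep[i] = (prec[i] * rep[i] + rate * dB) / new_prec
            prec[i] = new_prec
        for i in range(n):
            if active[i] and rep[i] <= c:
                active[i] = False             # all neighbors sever their links
                exit_time[i] = step * dt
    surviving = {(i, j) for i in range(n) for j in range(i + 1, n)
                 if adj[i][j] and active[i] and active[j]}
    return exit_time, surviving
```

Running the function repeatedly with different seeds (and with true qualities drawn from the priors) gives an empirical picture of how the same initial reputations can lead to different limiting networks.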

B. Stable Networks 

As mentioned, we call the limiting network structure as t goes to infinity, denoted by $G^\infty$, a
stable network. Formally, let $G^\infty = \lim_{t \to \infty} G^t$. This limiting structure always exists since agent
qualities are fixed, so by the law of large numbers any agent that remains in the network will
have its quality learned to an arbitrary precision over time. The probability that an agent who
is still in the network at time t ever becomes ostracized must therefore tend to zero as $t \to \infty$
(we show this analytically below). Which specific stable network eventually emerges is random
and depends on the signal realizations of each agent. The tractability of our model allows us to
explicitly characterize the set of stable networks that could emerge given a set of agents and a
network constraint $\Omega$, as well as the impact of the rate of learning on the probability distribution
over stable networks.

To understand which stable networks $G^\infty$ can emerge, we investigate whether a link $l_{ij}$ between
agents i and j can exist at $t = \infty$. If two agents i and j are not neighbors (i.e. $\omega_{ij} = 0$), then it
is certain that $g_{ij}^\infty = 0$. If two agents i and j are neighbors (i.e. $\omega_{ij} = 1$), then the existence
of this link $l_{ij}$ at $t = \infty$ requires that the reputations of both i and j never hit c for all finite
t, which means that neither agent is ever ostracized. Hence $G^\infty$ will always be a subset of the
initial network $G^0$, and is composed only of agents whose reputations never hit c for all finite t.

We say that an agent is included in the stable network if its reputation never hits c for all
t, so that it is never ostracized from the network.[13]

Fig. 1. Illustration of Possible Network Dynamics: From the same initial reputations for the red and blue agents, many different
network dynamics and stable networks are possible. In the top graph the red (larger circle and bolded line) agent has a true
quality less than c and so will be ostracized from the network for certain at some time, while the blue (smaller circle and thin
line) agent has a true quality above c and so may or may not be ostracized from the network depending on the signals it sends.
Each event leads to a different stable network, one with and one without the blue agent. In the bottom graph it is the blue
agent who has a true quality lower than c and so will be ostracized for sure, whereas the red agent could potentially stay in the
network indefinitely.

Note that being included in the stable network does not imply that an agent has any links in
the stable network, as it could also be that all of the neighbors of that agent were ostracized
even though the agent itself was not. We can calculate the ex ante probability that an agent
i is included in the stable network, which we denote by $P(S_i)$, with $S_i$ denoting the event in
which agent i is included in the stable network. This can be accomplished using standard results
regarding Brownian motion hitting probabilities, since $P(S_i)$ is equal to the probability that the
agent’s reputation never hits c for all finite t. The following proposition gives this probability.

[13] As a technical note, when we make the ostracization classification, we assume that an agent who has all its neighbors
ostracized continues to send information about itself at its signal precision level, with the signals sent via the same probability
distribution which is based on its true quality. So we still consider the agent “ostracized” if its reputation drops to c via this
information process even after all its neighbors have been ostracized. This assumption is made for technical purposes only and
has no impact on the dynamics or the welfare of the model, as the agent has no links in this case.

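Proposition 1. $P(S_i)$ depends only on the initial quality distribution and the link cost and can
be computed by

$$P(S_i) = \int_c^{\infty} \left(1 - \exp\left(-\frac{2}{\sigma_i^2}(\mu_i - c)(q_i - c)\right)\right) \phi\left(\frac{q_i - \mu_i}{\sigma_i}\right) \frac{dq_i}{\sigma_i} \tag{2}$$

Proof. See appendix. □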
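As a numerical illustration of Proposition 1, the sketch below evaluates the integral in equation (2) with a simple midpoint rule, assuming $\mu_i > c$ and taking $\phi$ to be the standard normal density. For comparison it also evaluates $2\Phi((\mu_i - c)/\sigma_i) - 1$, a closed form obtained here by completing the square in the integrand; this simplification is not stated in the text and is offered only as a cross-check. Function names, the grid size, and the truncation width are illustrative assumptions.

```python
import math
from statistics import NormalDist


def p_stable_numeric(mu, sigma, c, n=20000, width=12.0):
    """Midpoint-rule evaluation of equation (2) on [c, mu + width*sigma]."""
    std = NormalDist()
    upper = mu + width * sigma          # truncation point; the tail beyond is negligible
    h = (upper - c) / n
    total = 0.0
    for k in range(n):
        q = c + (k + 0.5) * h
        survive = 1.0 - math.exp(-2.0 / sigma ** 2 * (mu - c) * (q - c))
        total += survive * std.pdf((q - mu) / sigma) / sigma * h
    return total


def p_stable_closed(mu, sigma, c):
    """Closed form from completing the square (an assumption of this sketch)."""
    return 2.0 * NormalDist().cdf((mu - c) / sigma) - 1.0


print(p_stable_numeric(1.5, 1.0, 1.0), p_stable_closed(1.5, 1.0, 1.0))
```

Neither function takes the signal precision as an argument, which mirrors the statement that $P(S_i)$ depends only on the initial quality distribution and the link cost.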

Proposition 1 has several important implications. Note that since $P(S_i)$ is positive and less
than 1 for all i, no agent is certain to be included in or excluded from the stable network. Also
note that the probability an agent is part of the stable network is independent of that agent’s
signal precision $\tau_i$. Therefore the rate at which the agent sends information does not affect
the chance that it is in the stable network. This is because the rate at which the agent sends
information only affects when it gets ostracized from the network, but not whether it gets ostracized
overall.[14] Furthermore, note that the probability an agent i is included in the stable network is
independent of its links with other agents and the properties of those agents. Connections with
other agents affect the rate at which an agent sends information but not the agent’s true quality,
and so will not impact whether it is eventually ostracized from the network.

Using the explicit expression above, we can also describe how $P(S_i)$ depends on an agent’s
initial mean and variance, $\mu_i$ and $\sigma_i^2$.


[14] To understand this intuitively, recall that reputation evolves through Bayesian updating of the Brownian motion. A higher
precision increases the amount of information sent at every moment in time, but the overall probability distribution of the
information that is sent across all time remains the same. To see this rigorously, note that in the proof of Proposition 1 in the
appendix, the survival probability of an agent depends on $\tau_i$ only through the term $\tau_i t$. Therefore increasing $\tau_i$ and decreasing
the considered time t proportionally leaves the overall survival probability unchanged.

Corollary 1. For each agent i, $P(S_i)$ is increasing in the mean of its initial quality $\mu_i$,
decreasing in the variance of its initial quality $\sigma_i^2$, and decreasing in the link cost c. Moreover,
$\lim_{\mu_i \to \infty} P(S_i) = 1$, $\lim_{\sigma_i \to 0} P(S_i) = 1$, $\lim_{c \to -\infty} P(S_i) = 1$.

Proof. See appendix. □

These properties are intuitive since an agent with a higher mean quality and smaller variance is
less likely to have its reputation drop below c, and so is less likely to become ostracized. Moreover,
lowering the linking cost also reduces the hitting probability since the agent’s reputation
would now have to fall lower to be excluded from the network.

As mentioned, $G^\infty$ must be a subset of $G^0$. Further, it can contain links only amongst pairs
of agents that are both included in the stable network and were linked in the initial network.
Equivalently, the set of stable networks can be thought of as the set of networks that can be
reached from $G^0$ by sequentially ostracizing agents. Let $I\{S_i\}$ denote the indicator variable of
the event in which agent i is included in the stable network. Formally, a network can be stable if
and only if it is a matrix with entries given by $g_{ij} = I\{S_i\} I\{S_j\} I\{g^0_{ij} = 1\}$, for some realization
of the inclusion events $\{S_i\}_{i \in V}$. Links can exist only among agents that were never ostracized and were linked in
the original network. Note that different realizations of $\{S_i\}_{i \in V}$ could potentially correspond
to the same stable network.[15]

By Proposition 1 we know that the rates of learning do not affect the probability of each
event $S_i$. Since the rate of learning has no effect at an individual level, it cannot have an effect
at the aggregate level either. This is formalized in the following theorem. We can also use the
equation in Proposition 1 to derive an analytic expression for the probability that any specific
stable network emerges, which is presented in the corollary below. A figure in the appendix
shows an example of how the corollary can be applied to a simple network of three agents.

Theorem 1. The signal precisions of the agents, $\{\tau_i\}_{i \in V}$, do not affect the set of stable networks
that can emerge or the probability that any stable network emerges.

Proof. It is clear that a network G must be a subset of $G^0$ and can be stable if and only if there
exists at least one realization of the events $\{S_i\}_{i \in V}$ such that $g_{ij} = I\{S_i\} I\{S_j\} I\{g^0_{ij} = 1\}$.
Thus the set of stable networks does not depend on the learning speed. Moreover, according to
Proposition 1, $P(S_i)$ is independent over the different agents and does not depend on the speed
of learning. Hence the probability that any specific link exists in the stable network is also
independent of the learning speed, so the probability of any stable network emerging is also
independent of the learning speed. □

[15] For instance, suppose that the network comprises only two agents i and j. Then the event in which $S_i$ but not $S_j$ occurs
and the event in which neither $S_i$ nor $S_j$ occurs lead to the same stable network structure: the empty network.

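Corollary 2. The probability that a network G is a stable network is given by

$$P(G) = \sum_{\{S_i\}} \prod_{i \in V} P(S_i)^{I\{S_i\}} \left(1 - P(S_i)\right)^{1 - I\{S_i\}},$$

where the summation is over all realizations of the events $\{S_i\}_{i \in V}$ that correspond to G.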
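The sum over realizations in Corollary 2 can be carried out by brute force for a small network. The sketch below assumes, as stated in the proof of Theorem 1, that the inclusion events are independent across agents, and it takes the inclusion probabilities $P(S_i)$ as given inputs. The function name and the representation of a network as a set of index pairs are illustrative choices.

```python
from itertools import product


def stable_network_distribution(adj, p_include):
    """Probability of each possible stable network, by enumerating inclusion events.

    adj: symmetric 0/1 matrix of the initial network.
    p_include[i]: probability that agent i is included in the stable network.
    Returns a dict mapping frozenset of surviving links (i, j), i < j, to probability.
    """
    n = len(p_include)
    dist = {}
    for included in product([False, True], repeat=n):
        prob = 1.0
        for i in range(n):
            prob *= p_include[i] if included[i] else 1.0 - p_include[i]
        links = frozenset((i, j) for i in range(n) for j in range(i + 1, n)
                          if adj[i][j] and included[i] and included[j])
        # Several realizations can map to the same network, so accumulate.
        dist[links] = dist.get(links, 0.0) + prob
    return dist


# Two linked agents: the link survives only if both agents are included.
print(stable_network_distribution([[0, 1], [1, 0]], [0.6, 0.5]))
```

The enumeration visits $2^N$ realizations, so it is only practical for small examples such as the three-agent network mentioned in the text.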

We have shown that the speed of learning has no impact on the probability that a network G 
is stable. This is intuitive since learning only affects the duration of a link but not its final state. 
However, learning will have a crucial role on the social welfare of a network, which directly 
depends on how long the agents are connected. We will consider the impact of learning on the 
social welfare in the next section. 

V. Welfare Computation 

We will analyze overall social welfare from an ex ante perspective, given only the network
constraint $\Omega$ and the prior agent quality distributions. Importantly, the ex ante welfare is calculated
before the agent qualities are learned and any signals are sent. This type of welfare is the most
suitable for the type of design settings we will consider later, as it requires the least knowledge
on the part of the network designer. Let $P(L^t_{ij} \mid q, G^0)$ denote the probability that the link between
agents i and j still exists at time t. Also, let the parameter $\rho$ represent the discount rate of the
network designer.[16] We can define the overall ex ante social welfare W formally as follows:

$$W = \int_{q_1=-\infty}^{\infty} \cdots \int_{q_N=-\infty}^{\infty} \int_0^{\infty} \exp(-\rho t) \sum_i \sum_{j: g^0_{ij}=1} P(L^t_{ij} \mid q, G^0)\,(q_j - c)\, dt \prod_{k} \phi\left(\frac{q_k - \mu_k}{\sigma_k}\right) \frac{dq_k}{\sigma_k} \tag{3}$$

We will show that this social welfare expression can be calculated in a tractable fashion using
a somewhat indirect approach. This approach utilizes the fact that the ex ante social welfare is an
expectation over all the possible ex post signal realizations. We can calculate the ex ante welfare
by integrating over all possible realizations of the ex post welfare, which simplifies equation (3)
to a much more tractable form.

[16] We are assuming that the designer itself is more patient than the myopic agents. This can be thought of, for instance, as a
company manager who is more patient than its workers who act myopically in their interactions, or a financial regulator that is
more patient than the financial institutions, which have managers with myopic incentives.


A. Ex post welfare 

Consider an ex post realization of agent hitting times $e = \{\epsilon_i^{t_i}\}_{i \in V}$, where $\epsilon_i^{t_i}$ denotes the event
in which agent i’s reputation hits c at time $t_i$ given all the agent signals (note that $t_i = \infty$ means
that agent i’s reputation never hits c). In the event in which $t_i < \infty$, since the belief at time $t_i$ is
correct, the expected value of agent i’s quality conditional on this event $\epsilon_i^{t_i}$ is $E[q_i \mid \epsilon_i^{t_i < \infty}] = c$.
In the event with $t_i = \infty$, since the initial belief is accurate in expectation,

$$\mu_i = E[q_i] = P(\epsilon_i^{t_i<\infty})\, E[q_i \mid \epsilon_i^{t_i<\infty}] + P(\epsilon_i^{t_i=\infty})\, E[q_i \mid \epsilon_i^{t_i=\infty}] \tag{4}$$

$$= (1 - P(S_i))\, c + P(S_i)\, E[q_i \mid \epsilon_i^{t_i=\infty}] \tag{5}$$

and we have

$$E[q_i \mid \epsilon_i^{t_i=\infty}] = \frac{\mu_i - (1 - P(S_i))\, c}{P(S_i)} \tag{6}$$

where $P(S_i)$ is given by Proposition 1 and is independent of the network and the learning speed.

According to the above discussion, given an ex post realization e, an agent i obtains 0 surplus
from its neighbors that have finite hitting times and obtains positive surplus from those whose
reputation never hits c (and are therefore included in the stable network). The exact benefit
agent i receives in the second case depends on its own hitting time $t_i$, which determines the
link breaking time with the other agents. We can calculate the ex post surplus that an agent i
receives given e as follows:

$$W_i(e) = E_{q \mid e}\left[ \sum_{j: g^0_{ij}=1} \int_0^{\min(t_i, t_j)} \exp(-\rho t)\,(q_j - c)\, dt \right] \tag{7}$$

$$= \sum_{j: g^0_{ij}=1,\, t_j=\infty} \int_0^{t_i} \exp(-\rho t) \left( \frac{\mu_j - (1 - P(S_j))\, c}{P(S_j)} - c \right) dt \tag{8}$$

$$= \frac{1 - \exp(-\rho t_i)}{\rho} \sum_{j: g^0_{ij}=1,\, t_j=\infty} \frac{\mu_j - c}{P(S_j)} \tag{9}$$

Note that this $W_i$ is taken from the perspective of the designer as it incorporates future payoffs
at the discount rate $\rho$. This equation shows that in each ex post realization of other agent
hitting times, agent i benefits if $t_i$ increases and it is ostracized later from the network. Summing
over all agents, the social welfare given the ex post realization e is therefore

$$W(e) = \sum_i W_i(e) = \sum_i \frac{1 - \exp(-\rho t_i)}{\rho} \sum_{j: g^0_{ij}=1,\, t_j=\infty} \frac{\mu_j - c}{P(S_j)} \tag{10}$$

By taking the expectation over the events e, the ex ante social welfare can be found as
$W = E_e[W(e)]$. In order to compute the ex ante social welfare, we still need to know the
distribution of the $t_i$, which is coupled in a complicated manner with the initial network and the
learning process. For instance, if the neighbor of agent i has a low hitting time and is ostracized
quickly, then agent i sends information at a slower rate and its own hitting time would increase.
Thus directly computing the social welfare using the above equation is still difficult. In the next
subsection, we develop an indirect method to calculate the distribution of the $t_i$.

B. Hitting time mapping 

Recall that an agent’s links will scale up the rate at which it sends information compared to
the rate it would send information if its precision were constant at the base level of $\tau_i$. Therefore
each link also scales down the time at which the agent’s reputation hits c. So to calculate when
the agent is ostracized, we can first find when the agent’s reputation would hit c through sending
signals at its signal precision level, and then scale this time downwards proportionately based
on the network effect.[17] Consider an ex post realization of hitting times $\hat{e} = \{\hat{\epsilon}_i^{t_i}\}_{i \in V}$, in which
agent i’s reputation would hit c at time $t_i$ if its precision were fixed at $\tau_i$ at all times. Note that
the events $\hat{\epsilon}_i^{t_i}$ are independent from each other across different agents, and since the precision is
fixed they also do not depend on the network structure. The probability of $\hat{\epsilon}_i^{t_i}$ can be explicitly
computed in the following lemma.

Lemma 1. The probability density function $f(\hat{\epsilon}_i^{t_i})$, $\forall t_i < \infty$, can be computed as

[equation (11) is missing from the source text]

The probability mass point function $f(\hat{\epsilon}_i^{t_i = \infty}) = P(S_i)$.

Proof. See appendix. □
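The sketch below does not reproduce the expression in Lemma 1. It is an independent illustration of how base hitting times could be computed and sampled under the model's assumptions (normal prior $N(\mu_i, \sigma_i^2)$ with $\mu_i > c$, precision fixed at $\tau_i$). It relies on the observation that, with fixed precision, the reputation is a Gaussian martingale whose variance at time t is $w(t) = \sigma_i^2 - (1/\sigma_i^2 + \tau_i t)^{-1}$, so the reflection principle gives $P(t_i \le t) = 2\Phi(-(\mu_i - c)/\sqrt{w(t)})$. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()


def base_hitting_cdf(t, mu, sigma, tau, c):
    """P(reputation hits c by time t) at fixed precision tau (assumes mu > c)."""
    if t <= 0:
        return 0.0
    w = sigma ** 2 - 1.0 / (1.0 / sigma ** 2 + tau * t)   # variance of the reputation at t
    if w <= 0.0:
        return 0.0                                        # t too small to resolve in floats
    return 2.0 * _STD.cdf(-(mu - c) / math.sqrt(w))


def sample_base_hitting_time(mu, sigma, tau, c, rng=random):
    """Inverse-transform draw of the base hitting time; math.inf means 'never'."""
    u = rng.random()
    p_hit = 2.0 * _STD.cdf(-(mu - c) / sigma)             # limit of the CDF as t grows
    if u >= p_hit:
        return math.inf
    if u == 0.0:
        return 0.0
    z = _STD.inv_cdf(u / 2.0)                             # negative quantile
    w = ((mu - c) / z) ** 2                               # solves 2*Phi(-x/sqrt(w)) = u
    return w / (tau * sigma ** 2 * (sigma ** 2 - w))      # invert w(t) for t
```

In this sketch the sampled time is inversely proportional to `tau` for a given uniform draw, while the probability of never hitting c does not depend on `tau`, which is consistent with the scaling argument in the text.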
[17] Refer to the footnote following Proposition 1, on why the signal precision does not affect the survival probability, for a
justification of this type of scaling.

Using Lemma 1 we can easily obtain the distribution of joint events $f(\hat{e}) = \prod_i f(\hat{\epsilon}_i^{t_i})$ due to
the fact that the individual events are independent. This would measure the joint probability of
the agents exiting the network at times $\{t_i\}_{i \in V}$ if the information sending speed of the agents
were not being scaled by the number of their links. If there were no network effect, the ex
ante social welfare could be directly computed using the distribution of hitting times given by
Lemma 1. However, due to the network effect, the actual hitting times may vary for each $\hat{e}$. We
can define $M : [0, \infty]^N \to [0, \infty]^N$ to be the hitting time mapping function, which maps the hitting
times with no network effect to the actual hitting times when there is a network effect. In the
appendix we present an algorithm for computing M, which operates by scaling the information
speed of each agent at every time t by their current number of neighbors and updating the speed
at which an agent sends information when a neighbor is ostracized. Note that if $t_i = \infty$ in the
event $\hat{\epsilon}_i^{t_i}$ then it is also $\infty$ in the mapped event. This means that an agent that never leaves
the network with no scaling effect will not leave when the times are scaled either. Then given
a realization $\hat{e}$, the ex post social surplus can be computed as

$$W(\hat{e}) = \sum_i \frac{1 - \exp(-\rho M_i(\hat{e}))}{\rho} \sum_{j: g^0_{ij}=1,\, t_j=\infty} \frac{\mu_j - c}{P(S_j)} \tag{12}$$

where $M_i(\hat{e})$ denotes the mapped hitting time of agent i. Therefore, the ex ante social welfare is
$W = E_{\hat{e}}[W(\hat{e})]$. We note that this is a tractable equation
for the ex ante social welfare given any network structure and set of agents. Proposition 1 gives
the explicit expression for $P(S_j)$, and Lemma 1 provides the distribution of $\hat{e}$. Thus our model
allows for easy and tractable computations of the ex ante social welfare of any type of network.
Theorem 2 below formalizes this result.
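The appendix algorithm for M is not shown in this section, so the following is only one possible event-driven sketch of a hitting time mapping, together with the ex post surplus it implies. It assumes that an agent with k current links consumes its base hitting time k times as fast, and that an agent with no remaining links is treated as running at the base rate (mirroring the technical note on how ostracism is classified). The function names, the use of `math.inf` for "never", and the input layout are assumptions of this sketch.

```python
import math


def hitting_time_map(adj, base_times):
    """Map base hitting times (precision fixed at tau_i) to actual hitting times."""
    n = len(base_times)
    remaining = list(base_times)        # base-clock time left before hitting c
    actual = [math.inf] * n
    active = set(range(n))
    now = 0.0
    while True:
        # Current speed-up factor of each active agent: its number of active neighbors.
        speed = {i: max(1, sum(1 for j in active if adj[i][j])) for i in active}
        finite = [i for i in active if remaining[i] < math.inf]
        if not finite:
            return actual
        nxt = min(finite, key=lambda i: remaining[i] / speed[i])
        step = remaining[nxt] / speed[nxt]
        now += step
        for i in finite:
            remaining[i] -= step * speed[i]
        actual[nxt] = now               # agent nxt is ostracized at this time
        active.remove(nxt)


def ex_post_welfare(adj, base_times, mu, p_stable, c, rho):
    """Ex post surplus: agent i earns (mu_j - c)/P(S_j) per unit time from each
    never-ostracized neighbor j, discounted at rate rho until i's own exit."""
    t = hitting_time_map(adj, base_times)
    n = len(t)
    total = 0.0
    for i in range(n):
        discount = (1.0 - math.exp(-rho * t[i])) / rho
        total += discount * sum((mu[j] - c) / p_stable[j]
                                for j in range(n) if adj[i][j] and t[j] == math.inf)
    return total


# Line network 0 - 1 - 2 with base hitting times (1, 4, never).
print(hitting_time_map([[0, 1, 0], [1, 0, 1], [0, 1, 0]], [1.0, 4.0, math.inf]))
```

In the example, the middle agent starts with two links and so uses up its base time twice as fast until agent 0 leaves, after which it slows to the single-link rate. Averaging `ex_post_welfare` over independently sampled base hitting times would give a Monte Carlo estimate of the ex ante welfare under these assumptions.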


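Theorem 2. Given $\Omega$, the initial quality distributions, and the link cost c, the overall ex ante
social welfare can be computed as follows

$$W = E_{\hat{e}}\left[ \sum_i \frac{1 - \exp(-\rho M_i(\hat{e}))}{\rho} \sum_{j: g^0_{ij}=1,\, t_j=\infty} \frac{\mu_j - c}{P(S_j)} \right] \tag{13}$$

where the distribution of $\hat{e}$ is computed using Lemma 1 and the hitting time mapping function
M is given in the appendix.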


VI. Impact of Information and Learning 

In this section we study the impact of learning on ex ante welfare, both individual and overall,
given an initial network $G^0$. In particular, we will show how the agents’ signal precisions, a
representation of the rate of learning, impact individual agent welfare as well as the overall
social welfare.

As a benchmark, we consider the social welfare when there is no learning, which we denote
by $W^*$. When there is no learning, no existing link will be severed. The welfare of an
agent i without learning can therefore be computed by summing over the mean qualities of all
agents it is connected with initially:

$$W_i^* = \sum_{j: g^0_{ij}=1} \int_0^{\infty} \exp(-\rho t)\,(\mu_j - c)\, dt = \frac{1}{\rho} \sum_{j: g^0_{ij}=1} (\mu_j - c) \tag{14}$$

The ex ante overall social welfare without learning is given by the sum over the individual
welfares:

$$W^* = \sum_i W_i^* = \frac{1}{\rho} \sum_i \sum_{j: g^0_{ij}=1} (\mu_j - c) \tag{15}$$

A. Overall Impact of Learning 

Let $W(\tau_1, \ldots, \tau_N)$ be the ex ante social welfare when agents learn each other’s true quality
with the signal precisions being $\tau_1, \ldots, \tau_N$. We also let $W_i(\tau_1, \ldots, \tau_N)$ represent agent i’s ex
ante welfare given these signal precisions. The next theorem states that in any network, the
addition of learning has a negative impact on every individual’s ex ante welfare for any value of
the signal precisions. This immediately implies that it lowers the overall ex ante social welfare
as well.

Theorem 3. $W_i(\tau_1, \ldots, \tau_N) < W_i^*$ for all i and for all $\tau_1, \ldots, \tau_N$.

Proof. See appendix. □

There are two main factors at work in this result. First, the myopia of the agents causes
the learning to be done inefficiently. Second, cutting off a link imposes a negative externality on
the agent who is ostracized, since that agent can no longer receive benefits from its neighbors.
Taken together, these factors lead to a reduction in overall social welfare. More precisely, when
a link $l_{ij}$ is severed due to agent j’s reputation hitting c, agent i does not gain welfare compared
to the case without learning. This is because the expected value of having a link with j from $t_j$
on is 0 and thus having the link or not makes no difference.[18] However, agent j loses welfare
compared to the case without learning because agent i’s reputation is still above the link cost
and thus having the link would benefit j over not having the link.

This result supports the damaging impacts of ostracism found in the social psychology literature,
which were mentioned above in the literature review. The social psychology literature
usually documents the harmful effects of ostracism from the perspective of the agents that have
become ostracized and can no longer benefit from interactions with the other agents. However,
our result goes further by stating that the possibility of ostracism will actually lower every agent’s
welfare from an ex ante perspective. By allowing for the ostracism of others, agents open
themselves up to ostracism as well, which lowers their own welfare by more than they benefit
from ostracizing other agents. Theorem 3 shows that every agent is hurt ex ante by ostracism,
even those that would not themselves be ostracized in the majority of the ex post realizations of
the network.

B. Impact of Individual Information 

The previous result showed that learning is harmful in aggregate: under learning both individual
and overall network welfare are lower than without learning. However, we show in this
subsection that learning need not be harmful at an individual level, as the rate at which a single agent
sends information changes. We now investigate more closely how the information generation rate
of a single agent (i.e. an agent’s signal precision) affects welfare. The faster an agent generates
information about its own reputation, the faster the other agents will learn its true quality (if the
link is not broken).

First we characterize the impact of an agent’s signal precision on that agent’s own welfare.
The next proposition shows that sending more information about itself will always harm an
agent.

Proposition 2. $W_i(\tau_i, \tau_{-i})$ is strictly decreasing in $\tau_i$.

Proof. Consider any ex post realization $\hat{e} = \{\hat{\epsilon}_i^{t_i}\}_{i \in V}$. If $t_i = \infty$, then changing $\tau_i$ alone does not
change the fact that agent i would stay in the network forever, and so it does not affect the hitting
time realization of any other agent either. Therefore agent i’s welfare $W_i(e)$ is not affected. If
$t_i < \infty$, then the welfare of agent i depends on (1) the expected quality of all the neighboring
agents j whose $t_j = \infty$ and (2) its own hitting time $t_i$. Since (1) is not affected by changing $\tau_i$,
we only need to study how $\tau_i$ affects $t_i$.

Intuitively $t_i$ is decreasing in $\tau_i$ since agent i’s information sending speed is faster due to a
higher precision. We provide a more rigorous proof by contradiction as follows. Suppose that under a
larger precision $\tau_i' > \tau_i$ agent i’s new hitting time increases to $t_i' = t_i + \Delta > t_i$. In this new
realization, consider the duration from 0 to $t_i$. Since $t_i' > t_i$, all other agents’ information sending
process and speed do not change before $t_i$. Hence, agent i’s instantaneous precision at t changes to
$(\tau_i^t)' = \frac{\tau_i'}{\tau_i} \tau_i^t$.
Hence, information sending by agent i is faster at any moment in time before $t_i$. Since the
stopping time $t_i'$ is larger than $t_i$, the total amount of information sent by agent i given $\tau_i'$ is
larger than that given $\tau_i$. Because the total information sent should remain the same, this causes
a contradiction. Therefore $t_i'$ should be smaller than $t_i$ for a larger $\tau_i'$. □

[18] Agent myopia is causing the cut-off value to be too high, and so the agent does not benefit from its learning. This feature
of reputational learning is similar to that shown in van der Schaar and Zhang (2014). In Section VIII we discuss a possible
solution for this problem by providing agents a subsidy to increase experimentation.

This result is in accordance with Theorem 3 and shows that an agent sending more information
about itself will strictly decrease its own welfare. This is because in each realization in which
the agent is ostracized from the network, the agent will now be ostracized sooner and hence it
will enjoy fewer benefits from others. Since the agent already starts out with the maximal number
of links it can obtain, it in effect has nothing to gain and everything to lose by allowing its own
reputation to vary. We relax this assumption in the extensions section and allow agents to form
new links with those they are not connected with initially; under those circumstances an agent
will be able to benefit by generating more information about itself.

Though increasing the information sending speed is always harmful for an agent itself, it can
actually be helpful to its direct neighbors. The next proposition provides a sufficient condition
on the initial network such that this holds.

Proposition 3. Given an initial network $G^0$, for any two initially connected agents i and j that
are linked through a unique path (i.e. the direct link), increasing one’s precision increases the
other’s welfare.

Proof. Consider any ex post realization $\hat{e} = \{\hat{\epsilon}_i^{t_i}\}_{i \in V}$. If $t_i = \infty$, then increasing agent i’s
signal precision $\tau_i$ does not change the realization. Hence $t_j$ is not affected. If $t_i < \infty$, then
according to Proposition 2, the new hitting time $t_i'$ is sooner if agent i’s signal precision is larger.
This causes the link between agents i and j to be severed (weakly) sooner, leading to a (weakly)
later hitting time of agent j because agent j will send information at a slower speed for a longer
time. Since changing agent i’s signal precision does not change the finiteness of the hitting time
of all other agents, agent j’s welfare increases due to a longer hitting time for itself. □

Fig. 2. Example for Corollary 3

Since the information sending speed of agent j slows after agent i is ostracized, agent j’s 
hitting time is larger. Agent j therefore prefers its direct neighbor to send more information, so 
that it can cut off the link more quickly in case the neighbor is bad. After the link is broken, 
agent j will also reveal less information about itself, which is beneficial because a faster 
information sending speed is always harmful for the agent itself. In this way agent j would enjoy 
more benefits for a longer time from its links with its other neighbors. We can extend this 
analysis to more distant agents when the two agents are connected through a unique path. This is 
summarized in the corollary to Proposition 3 below. 

Corollary 3. Given any initial network G^, for any two agents i and j that have a unique path 
between them, increasing one’s signal precision increases/decreases the other’s welfare if they 
are an odd/even number of hops away from each other. 

The above result shows an odd-even effect of the distance between two agents on the agent’s 
welfare. In all minimally connected networks (such as star, tree, forest networks), any two agents 
have a unique path between each other and so the impact of any agent’s information sending 
speed on any other agent’s welfare can be completely characterized. 
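
The odd-even rule can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names and the adjacency-dictionary representation are hypothetical, the network is assumed to be minimally connected (so the path between two agents is unique, which the code does not verify), and the direction of the effect takes Proposition 3 as the base case (a direct neighbor, one hop away, benefits).

```python
from collections import deque


def hop_distance(adjacency, source, target):
    """Breadth-first search hop count; returns None if target is unreachable."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return distances[node]
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return None


def precision_effect(adjacency, sender, other):
    """Direction of the welfare effect on `other` when `sender` raises its precision.

    Assumption: the network is a tree or forest, so the path is unique.
    Odd hop counts give "increases" (base case: Proposition 3, one hop).
    """
    hops = hop_distance(adjacency, sender, other)
    if hops is None or hops == 0:
        return None  # no path, or the same agent: the rule does not apply
    return "increases" if hops % 2 == 1 else "decreases"


# Hypothetical four-agent line network i - j - k - l.
line = {"i": ["j"], "j": ["i", "k"], "k": ["j", "l"], "l": ["k"]}
effects = {agent: precision_effect(line, "i", agent) for agent in ("j", "k", "l")}
```

In this line network the sketch reports that j and l gain while k loses when agent i raises its signal precision.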

As an example, consider a network where four agents i, j, k, l are connected via a unique 
path, as depicted in Figure 2. Agent i is linked with agent j, agent j is linked with agent k, 
and agent k is linked with agent l. Then if agent i sends more information about itself, it stays 
connected with agent j for a shorter period of time. This causes agent j to send less information 
about itself, causing agent k to cut off its link with j more slowly if j were to be ostracized. 
Then agent k is able to link with its other neighbors for a shorter length of time in expectation, 
decreasing the ex ante welfare of k. Therefore agent k is hurt when the neighbor of its neighbor, 
agent i, sends more information. Agent l, however, now links with its own neighbors for a longer 
length of time, and so it benefits when i sends more information. 

However, when there are multiple paths between agents, which implies there are cycles in the 
network, the impact of the signal precision of an agent on the other agents’ welfares is much less 
clear. The reason is that with cycles the neighbor of an agent i’s neighbor may also be linked 
with agent i itself, and so the positive and negative effects of information from Corollary 3 are 
entangled together. The following proposition shows that even for an immediate neighbor, the 
impact could be totally opposite of Proposition 3 when cycles are present in the network. 

Proposition 4. If the initial network has cycles, then it is possible that increasing some 
agent’s signal precision decreases its immediate neighbor’s welfare. 

Proof. We prove by constructing a counterexample, which is shown in Figure 3. Consider a 
network with $K > 3$ agents. Agents 1, 2, 3 form a line and the other $K - 3$ agents connect to 
both and only agents 1 and 2. We assume that agent 3’s true quality is perfectly known (initial 
variance 0) and large. Hence, agent 3’s reputation never hits c. We also assume that the mean 
qualities of agents 4 to $K$ are close to c. Hence, agent 2 gains almost no benefit from those 
agents even when $K \to \infty$. 

Consider a realization in which agent 1’s reputation hits c at $t_1 < \infty$ and agent 2’s 
reputation hits c at $t_2 < \infty$. By increasing the signal precision of agent 1, its hitting 
time decreases to $t_1' < t_1$. If $t_1' > t_2$, then agent 2’s hitting time is not affected, i.e. 
$t_2' = t_2$. Otherwise, the new hitting time $t_2'$ may be different from $t_2$. To simplify the 
analysis, we consider the extreme case in which $\tau_1 \to \infty$, thereby $t_1' \to 0$. 
Therefore, agent 2 loses the link with agent 1 from the beginning in any realization. However, 
since agents 3 to $K$ also lose the link with agent 1 from the beginning, for those whose hitting 
time was earlier than $t_2$, their hitting time would increase by a factor of 2. If there are at 
least three agents among 4 to $K$ whose hitting time was in $[t_2/4, t_2/2]$, agent 2’s 
information sending speed will increase sufficiently much that agent 2’s hitting time is smaller. 
By making $K$ large we can always make the probability of this event large enough. Thus, agent 2’s 
hitting time will decrease on average. □ 

Fig. 3. Counterexample for Proposition 4. 

Footnote: A neighbor’s neighbor also being linked with the agent itself is known in the social 
network literature as triadic closure. 

We have seen that increasing the information sending speed of an individual agent i could be 
either good or bad for other agents depending on their locations in the network and their relation 
with agent i. We note that it could similarly be good or bad for overall social welfare. So, in 
contrast with the earlier theorem, increasing the amount of information about a single agent can 
benefit the network overall. This would happen, for instance, if there are three agents i, j, and 
k who are connected in a line, with links ij and jk. Suppose that the mean of agent k's quality is 
much higher than those of the other two agents. Then most of the welfare in this network comes 
through the link between agents j and k. If agent i sends more information, agent j would be 
able to preserve its link with agent k for a longer period of time, and overall social welfare 
would increase. This example highlights how critical the network structure is in determining the 
overall impact of more information by a single agent. 

VII. Optimal Networks 

In this section, we study which underlying network constraints $\Omega$ maximize the overall ex 
ante social welfare. Equivalently, we could think of a benevolent network planner that wishes to 
maximize social welfare by designing the network constraint $\Omega$ through designating which 
agents are able to form links with which other agents. For instance, in the financial network 
setting we could think of a regulator that specifies which types of financial institutions are 
allowed to transact with which other types of institutions in order to maximize overall social 
welfare. 

A. Fully connected networks 

One intuition is that a fully connected network, with no constraints on links, would be optimal 
since it results in the largest number of links initially, and we have assumed that all agents have 
an initial reputation higher than the linking cost c. This intuition is accurate in certain cases, 
such as if the designer is extremely impatient (i.e. $\rho \to \infty$). Since the designer cares 
only about the initial time period, and when time is short almost no new information can be 
learned, it is best to design the network based on the agents’ starting reputations. Surprisingly 
though, the fully connected network is also optimal on the other extreme, when the designer is 
completely patient (i.e. $\rho \to 0$). In this case, the designer cares about the social welfare 
of the stable network that eventually develops, and allowing all agents to be connected initially 
leads to the largest probability of links in the final stable network. We prove these welfare 
results in the following proposition. 

Further, note that the designer’s level of patience is inversely related with the rate of learning, 
as faster learning means that information is revealed sooner and thus less patience is required. 
Therefore a similar result holds for the rate of learning: as the rate of learning becomes extreme, 
the fully connected network becomes optimal as well. So for instance, a financial regulator 
should optimally let all types of financial institutions transact with each other if it is very 
patient or very impatient, or if information production is extremely fast or slow. 

Proposition 5. 1. If the designer is either completely impatient (i.e. $\rho \to \infty$) or 
completely patient (i.e. $\rho \to 0$), the optimal $\Omega$ is the fully connected network. 

2. Fix the other parameters of the model and suppose the agents’ signal precisions are all 
multiplied by the same constant $\lambda$. If learning becomes very fast (i.e. $\lambda \to \infty$) 
or very slow (i.e. $\lambda \to 0$), then the optimal $\Omega$ is the fully connected network. 

Proof. See appendix. □ 

Footnote: We note that many other types of objective functions are also possible instead of the 
overall ex ante social welfare. For instance the designer may wish to maximize network welfare 
generated over a certain time interval, or before a set deadline is reached. Or the designer may 
weigh the welfare of some agents more heavily than that of others. Given the tractability of our 
model, many of our results can be extended for these alternative settings. 

When the designer is either completely patient or completely impatient, the social welfare depends 
only on the stable network or the initial network, respectively. The exact hitting time does not 
affect the social welfare. Similarly, if the learning is very slow, then the network structure 
always remains at the initial network, and if the learning is very fast then the stable network is 
realized very quickly, so in both cases a fully connected network is optimal. The idea is that in 
both extremes, the exact path of learning is no longer critical and so the negative externalities 
of information are mitigated. 

For intermediate levels of patience or learning however, changes in individual agent hitting 
times due to linking could have a significant impact on the social welfare. We will show later 
that having all agents fully connected with each other is not always the optimal choice. In the 
next proposition though we show that the fully connected network is still optimal in the case 
where the agents are homogeneous and have very high initial qualities. 

Proposition 6. Suppose all agents are ex ante identical. Fixing the other parameters, there exists 
$\bar{\mu}$ such that if $\mu_i > \bar{\mu}$ $\forall i$, then the optimal $\Omega$ is the fully 
connected network. 

Proof. We will prove that for $\mu$ large enough, the social welfare of any network that is not 
fully connected will be increased through the addition of any new link. Therefore the welfare of 
the fully connected network will be greater than the welfare of any other network. Consider an 
arbitrary network constraint $\Omega$ that is not fully connected. Suppose that a link between 
agents i and j is added to the network, and consider the welfare of the new network constraint 
$\Omega'$. 

First consider the change in welfare of agent i. In any realization where agent i is ostracized, 
its welfare through having the extra link with j decreases by no more than the welfare 
loss when it loses all its links with the other agents immediately. In any realization where agent 
i is not ostracized, its welfare with the additional link increases by the discounted value of 
the new link given the expected quality of agent j. Thus the change in welfare for agent i is 
bounded below by a term proportional to $\mu$ that weights the gain by $P(S_i)$ and the loss by 
$1 - P(S_i)$. Similarly, we can show that the change in welfare for agent j is bounded below by 
the analogous term with $P(S_j)$ and $1 - P(S_j)$. 

Now consider the change in welfare for all the other agents in the network. In any realization 
where both agent i and agent j are not ostracized, the hitting times of all the agents in the 
network are unaffected by the new link. In any realization where either agent i or agent j is 
ostracized, the change in welfare for all the other agents is bounded below by their maximal 
welfare loss. Thus the total change in welfare for all other agents in the network is bounded 
below by that maximal loss weighted by the probability 

$$P(S_i)(1 - P(S_j)) + P(S_j)(1 - P(S_i)) + (1 - P(S_i))(1 - P(S_j)) = 1 - P(S_i)P(S_j).$$ 

Combining the above two observations, the change in welfare for the whole network is bounded 
below by the sum of the lower bounds for agent i, agent j, and all the other agents. When $\mu$ 
is large, $P(S_i)$ and $P(S_j)$ converge to 1 by an earlier proposition, so the probability 
above converges to 0. Therefore, for $\mu$ large enough, the lower bound for the change in welfare 
converges to a positive number. □ 

B. Core-periphery networks 

As agents become more heterogeneous in terms of their initial expected quality, it can be 
optimal to constrain connections among agents. Suppose agents are divided into two separate 
types, and the initial mean quality of the high type agent is $\mu_H$ while the initial mean 
quality of the low type agent is $\mu_L < \mu_H$. We show that when the expected qualities of the 
two types are sufficiently different, the optimal network constraint has a core-periphery structure. 

Theorem 4. Suppose that there are two groups of agents, one with initial reputation $\mu_L$ and 
one with initial reputation $\mu_H$. Fixing all other parameters, there exists $\bar{\mu}$ such 
that $\forall \mu_H > \bar{\mu}$, the optimal $\Omega$ is a core-periphery network where all high 
type agents are connected with all other agents and no two low type agents are connected. 
($\bar{\mu}$ will depend on the other network parameters.) 

Proof. We first show that all high type agents should connect to all other high type agents. This 
is based on a similar argument as in the proof of Proposition 6. Since, when $\mu_H \to \infty$, 
all high type agents will stay in the stable network with very high probability, adding a link 
between any two high type agents will strictly improve their welfare while impacting the welfare 
of all other agents with very low probability. Hence, there must exist a large enough value for 
$\mu_H$ such that the welfare of high type agents is maximized when all high type agents connect 
to all other high type agents in the initial network. 

Next we show that no two low type agents should connect to each other in any network where 
each is linked to at least one high type agent. When $\mu_H \to \infty$, the welfare obtained by a 
link with any low type agent j is dominated by that of a link with high type agents, i.e. we can 
suppose that the welfare received by a link with another low type agent is approximately zero in 
comparison to a link with the high type agents. Having additional links with other low type agents 
reduces the hitting time of agent j, $M_j(t)$, in the event that it gets ostracized, thereby 
reducing agent j’s welfare by more than the welfare gain of the additional link. Therefore, low 
type agents do not connect to each other in the optimal initial network. 

Footnote: Although Theorem 4 assumes there are exactly two types, a similar result holds if 
instead the agents are composed of two groups and within each group have parameters that are 
sufficiently close together. 

Finally we show that all low type agents should connect with every high type agent. Since 
the probability that the high type agent is ostracized approaches zero, such a link does not affect 
them relative to the extra welfare that the low type agents receive. Therefore we consider only 
the effect on the welfare of the low type agent to be connected with all high type agents. In 
a realization where the low type agent is not ostracized, this is optimal for all agents, as the 
high type agent stays in the network with very high probability when $\mu_H$ is large enough. Thus 
both agents have their welfare increased while not affecting the welfare of all other agents. We 
show that it is also optimal in realizations where the low type agent is ostracized. Again we will 
assume that the high type agent is not ostracized, which will hold for $\mu_H$ high enough. The low 
type agent receives a flow payoff from every high type agent that it has an active link 
with. Note that in the hitting time mapping function the hitting time of an ostracized agent i is 
scaled by 1/K, where K is the total number of high type neighbors. Thus the decrease in hitting 
time is exactly balanced out by the increase in flow payoff in the case without discounting, and 
with discounting it is strictly better for the low type agent to have an extra link. 

□ 

The above result shows that under the optimal network constraint, high reputation agents 
should be placed in the core and connected with all other agents, while low reputation agents 
should be placed in the periphery and not connected with other low reputation agents. Therefore 
agents with lower initial reputations should be placed in less central positions within the network 
in order to mitigate the negative effects of ostracism. Allowing low reputation agents to connect 
with too many other agents would increase the rate at which they send information, causing them 
to be ostracized sooner and hurting them more than they would gain through the direct benefits 
of the extra links. This core-periphery structure is commonly seen in many real-world financial 
networks, with large well capitalized banks in the core and smaller banks in the periphery. A 
reason for this could be that the greater reputation of large banks lets them withstand negative 
shocks more easily without being ostracized by their counterparties. Smaller banks produce less 
information through their smaller number of transactions, allowing them to avoid being ostracized 
as quickly. 
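
The core-periphery constraint described in Theorem 4 is easy to write down explicitly. The following is a minimal sketch, not the authors’ code: the function name, the agent labels, and the representation of a constraint as a set of unordered pairs are all assumptions made for illustration. A pair of agents is allowed to link exactly when at least one of them is a high type agent.

```python
from itertools import combinations


def core_periphery_constraint(high_agents, low_agents):
    """Return the allowed links as a set of unordered pairs (frozensets).

    Every pair containing at least one high type agent is allowed;
    pairs of two low type agents are excluded.
    """
    high = set(high_agents)
    agents = list(high_agents) + list(low_agents)
    return {
        frozenset(pair)
        for pair in combinations(agents, 2)
        if high & set(pair)  # keep the pair only if it contains a high type agent
    }


# Hypothetical example: two core agents and three periphery agents.
allowed = core_periphery_constraint(["H1", "H2"], ["L1", "L2", "L3"])
```

With two high type and three low type agents, the sketch allows one core-core link and six core-periphery links, and no link between two periphery agents.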

We note that the above result depends heavily on the type of learning environment that is 
present. From Proposition 5 we know that if the designer was either very patient or very impatient, 
or if learning was very slow or very fast (relative to the parameters of the agents), then the 
optimal initial network would be the fully connected network. Fixing the agent reputations, a 
core-periphery constraint structure is only optimal at intermediate levels of learning. 

C. Star Networks 

Star networks are common networks in the real world, where a single central agent is connected 
with many peripheral agents. Examples include a single boss and many subordinates, the head of 
a political party that coordinates the disparate branches of the party, or a large trader that deals 
with many small traders. There are several important forces to consider when placing agents 
within a star network. Such networks depend greatly on the central agent, because that agent is 
connected with all other agents and it therefore has the most links. The central agent is therefore 
the most important agent to consider, and choosing the best agent to be in the center is crucial 
to the overall welfare of the network. 

The initial mean and the signal precision of the central agent are two exogenous parameters that 
must be carefully considered when choosing the central agent. A high initial mean is beneficial 
because it increases the expected flow benefits that all the other agents who are connected to the 
central agent will receive. However, a higher signal precision is harmful because it allows for a 
greater probability that the central agent becomes ostracized quickly, thus causing the network 
to fall apart. Such an event would greatly lower social welfare. Therefore there is a trade-off 
between the initial mean and the signal precision of the central agent: it is desirable to have 
a central agent with a higher mean but a lower signal precision. In particular, choosing the 
agent based only on its initial mean expected quality is not optimal, whereas under complete 
information it would be optimal to always place the highest realized quality agent in the center. 

Footnote: We note that financial regulators have started imposing core-periphery structures on 
various financial networks to encourage stability. Many banks are now required to trade through a 
central clearing counterparty (CCP), which is a large financial institution that is ideally very 
stable. The idea is that trading with the CCP will mitigate the uncertainties that individual 
banks have about each other’s qualities and thus help prevent liquidity runs during a financial 
crisis. 

We show these results formally in the next proposition. For concreteness, suppose that the 
central agent in the network is denoted by agent 1. The exogenous parameters of the agents are 
defined the same way as previously. 

Proposition 7. The overall social welfare is strictly increasing in $\mu_1$ and strictly 
decreasing in $\tau_1$ and $\sigma_1^2$. 

Proof. We can break social welfare into two components: the welfare of the central agent, and 
the welfares of each periphery agent. Notice that the welfares of the periphery agents are strictly 
increasing in $\mu_1$ but do not depend on $\sigma_1^2$ or $\tau_1$, for similar reasons as in the 
proof of an earlier theorem. Also, the welfare of the central agent is strictly increasing in 
$\mu_1$, as that allows the central agent to stay in the network for a longer period of time. Thus 
overall social welfare is increasing in $\mu_1$. The welfare of the central agent is strictly 
decreasing in $\tau_1$ for the same reasons as in an earlier proposition. Thus overall social 
welfare is decreasing in this parameter. □ 

Figure 4 shows the trade-off between the mean and the signal precision of the central agent 
explicitly via a simulation. It plots the contour lines of the overall ex ante welfare of the 
network, and it shows that social welfare increases as the initial mean increases and the signal 
precision decreases, and therefore selecting the best central agent depends on both factors. 

We note that for the periphery agents on the other hand, the exogenous parameters have a much 
less clear relationship with the overall social welfare. We can actually show through examples 
that social welfare can increase or decrease in each of these factors for periphery agents. The 
same relationships as for the central agent can hold, and a simple example would be a two person 
network. However, a marginally higher mean or a lower signal precision by a single periphery 
agent can actually decrease overall welfare. For instance, consider a network where the central 
agent has an initial expected quality close to c, one periphery agent denoted by agent i also 
has an initial expected quality close to c, and the qualities of all other periphery agents are 
very high. In such a case, increasing the expected quality of agent i by a small amount, or 
decreasing agent i’s signal precision, would harm overall social welfare. These changes would 
result in the central agent being connected to agent i for a longer stretch of time, which is 
undesirable since the other periphery agents are of much higher expected quality, and so causing 
the central agent to send more information is harmful. Therefore in such a network it would be 
better for agent i to send information more quickly in order for it to exit the network sooner. 
Fig. 4. Simulation Illustrating Proposition 7. The simulation uses a network of 6 agents. The 5 
periphery agents have $\mu_i = 2$, $\sigma_i = 2$, and $\tau_i = 1$. The central agent has 
$\sigma_1^2 = 2$, while its initial mean ranges from 2 to 2.5 and its signal precision ranges 
from 1.7 raised to the power of -100 to -95. 4000 realizations of agent hitting times were taken 
at each different agent mean, for a total of 32000 different hitting times. When drawing 
realizations across different means, the quantiles of the agent qualities were fixed to ensure 
faster convergence. 


We note that the trade-off identified above matters only if learning is fast enough, whereas if 
learning becomes very slow (or the designer becomes very impatient), then this trade-off goes 
away. This is summarized in the following proposition. 

Proposition 8. If the rate of learning becomes very slow (i.e. $\lambda \to 0$), then the optimal 
star network is obtained by placing the agent with the highest initial reputation in the center. 

Proof. In the limit of very slow learning, only the initial welfare generated matters, and placing 
the agent with the highest expected quality in the center generates the highest welfare over all 
initial network structures. □ 

Proposition 8 shows that the decision to place an agent at the center depends only on each 
agent’s initial mean in the limit of very slow learning (a similar result holds for very 
high designer impatience). When the network is constrained to be a star network, the highest 
initial welfare is obtained by having the highest initial expected quality agent in the center if 
the learning is very slow. 

D. Ring networks 

In this section we focus on a special type of network: a ring network. Suppose for convenience 
that agents are homogeneous in terms of initial expected quality and variance. Assume that under 
the network constraint $\Omega$ each agent is limited to at most two neighbors. Hence, for a given 
number of agents, they would only be able to form one or multiple ring networks of different 
sizes. This could represent a work environment in which agents work in pairs on projects and 
can work on up to two projects at a time, or a financial network in which financial institutions 
seek two partners to trade with. 

We study how the size of different rings affects the welfare an agent obtains, and hence we 
can determine the optimal size of the rings that agents should form together. Let $W(n)$ denote 
the welfare an agent can obtain if it is in a ring of size $n$ under the network constraint 
$\Omega$. We show that networks with rings of three agents (so there is triadic closure among the 
agents) will maximize both agent welfare and overall social welfare. 

Proposition 9. The optimal size of a ring network is 3 agents. 

Proof. Consider a ring network consisting of three agents i, j, k. We focus on the welfare of 
agent i and show that it is maximized compared to rings of other sizes. Since agents are identical, 
this means that total social welfare is maximized as well. 

Agent i obtains a positive benefit in two cases: (1) realizations in which both agents j and k’s 
reputations never hit c; (2) realizations in which exactly one of agents j and k’s reputations 
never hits c. The probabilities that these two cases happen are independent of the network 
structure by an earlier proposition. In the first case, having additional agent(s) between agents 
j and k does not affect agent i’s realization and hence, agent i’s welfare is not affected. In the 
second case, having additional agent(s) between agents j and k will change i’s realization with 
positive probability. 

Footnote: For convenience we assume that the number of agents N is divisible by n. 

Footnote: The social networks literature views triadic closure as the result of common 
preferences or trust, whereas our model derives a reputational reason for such networks. 

Consider a realization in which agent k’s reputation never hits c and agent j’s reputation hits c 
at tj. In the ring of size 3, agent j’s direct neighbor besides i (i.e. agent k) never hits c. When 
there are additional agents, it is either the case that agent j’s new direct neighbor never hits c 
or hits c before infinity. If agent j’s new direct neighbor hits c before infinity, then agent j’s 
new hitting time may increase and hence agent i’s new hitting time may decrease, leading to a 
lower welfare for agent i. □ 

The intuition behind this result is similar to the reasoning of Proposition 3, in which having a 
direct neighbor send more information is beneficial for an agent. With only three agents in each 
ring, an agent learns about a neighbor that would be excluded from the stable network at a faster 
rate, since that neighbor remains connected with the other neighbor, when the other neighbor is 
included in the stable network, until the first neighbor is ostracized. This guarantees a fast rate 
of learning about the low expected quality neighbor, allowing the agent itself to have more time 
to stay connected with the high expected quality neighbor that is not ostracized. With more than 
three agents, the neighbor that is excluded from the stable network may have its own neighbor 
disconnect in advance, slowing the rate of information the ostracized neighbor produces and 
hurting the agent itself. 
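
As an illustration of the constraint favored by Proposition 9, the sketch below partitions a list of agents into disjoint rings of a given size and returns the allowed links. It is a hypothetical helper, not part of the model: the function name and the pair-set representation are assumptions, and it relies on the convenience assumption that the number of agents is divisible by the ring size.

```python
def ring_constraint(agents, ring_size=3):
    """Partition agents into disjoint rings and return the allowed links.

    Each link is an unordered pair (frozenset). Raises ValueError if the
    agents cannot be split evenly into rings of the requested size.
    """
    agents = list(agents)
    if ring_size < 3:
        raise ValueError("a ring needs at least three agents")
    if len(agents) % ring_size != 0:
        raise ValueError("number of agents must be divisible by ring_size")
    links = set()
    for start in range(0, len(agents), ring_size):
        ring = agents[start:start + ring_size]
        for position, agent in enumerate(ring):
            # Link each agent to the next one, wrapping around to close the ring.
            links.add(frozenset((agent, ring[(position + 1) % ring_size])))
    return links


# Hypothetical example: six agents split into two triangles.
links = ring_constraint(["a", "b", "c", "d", "e", "f"], ring_size=3)
degree = {x: sum(x in link for link in links) for x in "abcdef"}
```

Every agent ends up with exactly two neighbors, matching the constraint that each agent is limited to at most two links.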

We can extend this result to ring networks with more than three agents. Similar to the odd/even 
effect highlighted in Corollary 3, we can show that rings with an odd number of agents will 
always have higher expected social welfare than ring networks with an even number of agents. 
However, as the number of agents grows large the difference in the social welfare of an even 
and odd number of agents eventually goes to zero. 

Corollary 4. If $n$ is odd, then $W(n) > W(m)$, $\forall m > n$. If $n$ is even, then 
$W(n) < W(m)$, $\forall m > n$. Moreover, $W(n)$ converges to a limit as $n$ approaches infinity. 

Proof. The proof is similar to that of Proposition 9, except we take into account the odd-even 
effect discussed in Corollary 3. We still only need to consider the case when exactly one of 
agents j and k’s reputations never hits c. Without loss of generality assume that j is not included 
in the stable network. With four agents, social welfare is lower than with three because the 
neighbor of j, call it l, may be ostracized before j is ostracized, causing j’s information 
speed to slow down. With five agents, social welfare is higher than with four because in the same 
case, there is a chance that agent l’s other neighbor is ostracized before agent l is ostracized, 
resulting in a decrease in agent l’s information speed and an increase in agent j’s information 
speed. This argument can be extended indefinitely for any number of agents to prove the above 
result. We note that the limits of the social welfares are the same, since the probability of a 
neighbor very far away sending a signal that affects agent j’s hitting time approaches zero as the 
number of agents becomes very large. Such an event can only occur if all the agents in between 
have an ostracism time less than agent j itself, an event with probability that approaches zero 
as the number of agents gets large. □ 


VIII. Extensions 

As seen above, learning can have a negative impact on social welfare in a variety of networks, 
and a large reason for this is the myopia of the agents. Since the agents are not experimenting 
for long enough, learning is inefficient and social welfare is lost. In this section, we consider 
four possible extensions that could alleviate this issue and allow for higher social welfare. 

A. Linking Subsidy 

A potential method of addressing the negative effects of learning is to give subsidies to the 
agents for linking with others. For instance, a company may wish to give workers awards or 
bonuses for collaborating with colleagues. Or in a financial setting, a regulator may give financial 
incentives for firms conducting mutual investments, or guarantee interbank transactions during 
a financial crisis to lower default risk. We model a subsidy by assuming that for every link 
that an agent maintains, it receives an extra flow benefit of $\delta$ from the network designer. 
This linking subsidy does not affect the social welfare computation since it is a direct transfer 
from the network designer to the agent, but it would change agents’ decisions of when to break a 
link. Since agents are myopic, an agent i will break its link with agent j if and only if agent 
j’s reputation drops below $c - \delta$. The linking subsidy therefore causes the agents to learn 
more information about their neighbor’s quality and break a link only if the neighbor is very 
likely to be bad. We show below that by properly choosing the linking subsidy the social welfare 
can improve compared with the case when there is no learning about agents’ qualities. Let 
$W(\delta)$ denote the ex ante social welfare when the linking subsidy is equal to $\delta$. 

Theorem 5. There exists $\bar{\delta}$ such that $\forall \delta > \bar{\delta}$, 
$W(\delta) > W^*$. Moreover, $\lim_{\delta \to \infty} W(\delta) = W^*$. 

Proof. See appendix. □ 
Note that by Theorem this result also shows that the social welfare is higher than in the 
standard network model with no subsidy. Thus by giving agents subsidies that encourage 
them to experiment for longer, the designer raises social welfare. The intuition is that 
when the link subsidy is high enough, any link that is broken will involve an agent of 
very low expected quality. Thus although the ostracized agent may still be hurt by being 
disconnected, its neighbors benefit by a sufficiently large amount that overall social welfare 
increases. Learning is therefore now beneficial and improves welfare overall. The second part 
of the theorem states that if the linking subsidy becomes too high, then the social welfare will 
converge to the social welfare without learning. This is because when the subsidy is too high it 
becomes almost impossible for a link to break, and so the network with high probability will not 
change, just as in the case without learning. A linking subsidy is therefore beneficial for 
the network, but it must not be set too high if social welfare is to be maximized. 
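
The myopic break rule under a subsidy can be written as a one-line test. The following is a minimal sketch, not part of the original model description: the function name `breaks_link` and its argument names are hypothetical, and the rule simply encodes the statement that a link is severed once the neighbor's reputation drops below the link cost minus the subsidy.

```python
def breaks_link(neighbor_reputation: float, c: float, delta: float = 0.0) -> bool:
    """Myopic rule: sever the link once the neighbor's reputation is below c - delta.

    c is the link cost and delta the per-link flow subsidy (delta = 0 is the base model).
    """
    return neighbor_reputation < c - delta


# With c = 1.0, a reputation of 0.9 triggers a break without a subsidy,
# but the link is kept when the subsidy is 0.5 (threshold 0.5).
print(breaks_link(0.9, 1.0), breaks_link(0.9, 1.0, delta=0.5))
```

A larger subsidy lowers the threshold, so under this rule a link survives longer and more is learned about the neighbor before any break occurs.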

B. New Link Formation 

Another way that learning would be more socially beneficial is if agents were able to form 
new links with other agents whose reputations are very high. In this extension, we assume that 
a pair of agents who are not initially linked according to the network constraint $\Omega$ can form 
a new link by incurring an instantaneous cost $\gamma > 0$. There is no cost to forming links with 
agents that they are connected to under $\Omega$. So unlike previously, when there was a hard barrier 
between agents not connected according to $\Omega$, agents can now break this barrier by paying an 
instantaneous cost. This cost could be exogenous, for instance the cost of time and energy in 
becoming familiar with a new agent, or the cost of reducing some physical barrier between the 
agents (distance or geographic barriers). The cost could also be set by the network designer, such 
as a tax on link creation. Since we assume the formation cost is instantaneous, it is infinitesimal 
in the social welfare calculation and so only affects welfare through its impact on agent actions. 

We assume that forming a link this way requires bilateral consent as usual. Agent i will want 
to form a link with agent j if agent j's reputation is higher than $c + \gamma$. Therefore a new link 
between agents i and j is formed at time t if and only if $\mu_i^t > c + \gamma$ and $\mu_j^t > c + \gamma$. The dynamics 
of our model will now feature some agents attaining high reputation levels and being able to link 
with other previously inaccessible agents that have also attained high reputation levels. Allowing 
these two high expected quality agents to link together will improve social welfare due to the 
large mutual benefits that are generated from their link. 
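
The formation rule above combines a myopic individual test with bilateral consent. A minimal sketch follows; the function names `wants_new_link` and `new_link_forms` are hypothetical and are introduced only for illustration.

```python
def wants_new_link(partner_reputation: float, c: float, gamma: float) -> bool:
    """An agent pays the formation cost gamma only if the partner's reputation exceeds c + gamma."""
    return partner_reputation > c + gamma


def new_link_forms(mu_i: float, mu_j: float, c: float, gamma: float) -> bool:
    """Bilateral consent: agent i must want j (based on mu_j) and agent j must want i (based on mu_i)."""
    return wants_new_link(mu_j, c, gamma) and wants_new_link(mu_i, c, gamma)


# With c = 1 and gamma = 2 the threshold is 3: reputations (5, 6) form a link, (5, 2.5) do not.
print(new_link_forms(5.0, 6.0, 1.0, 2.0), new_link_forms(5.0, 2.5, 1.0, 2.0))
```

Raising gamma raises the common threshold, so only pairs whose reputations are both high can cross the barrier.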




We can compare the social welfare produced by allowing this extra link formation against the 
social welfare in the basic model. Let $W(\gamma)$ denote the ex ante social welfare when the link 
formation cost is equal to $\gamma$, and let $W$ be the social welfare in the basic model without the 
extra link formation. 

Theorem 6. There exists $\bar{\gamma}$ such that $\forall \gamma > \bar{\gamma}$, $W(\gamma) > W$. 

Proof. Consider any realization e when link formation is not allowed. The ex post welfare $W(e)$ 
is changed only when there is some time $t^*$ such that there exist two agents i and j, who are 
not initially connected, such that $\mu_i^{t^*} > c + \gamma$ and $\mu_j^{t^*} > c + \gamma$. In the original realization e, 
conditional on $t^*$, there are two cases: 

• $C_1$: Both agents' reputations never hit c after $t^*$. 

• $C_2$: At least one agent's reputation hits c after $t^*$. 

When $C_2$ occurs, allowing link formation may change the hitting times of all agents in the network 
and hence the welfare $W(e \mid C_2)$ may change. However, the probability of $C_2$ occurring tends to 
zero as $\gamma$ tends to infinity by Proposition. When $C_1$ occurs, the social welfare increases by at 
least $\frac{(c+\gamma) - (1 - P(C_1))c}{\rho P(C_1)}$. When $C_2$ occurs, the welfare decreases by at most $B(C_2)$, a function 
that is at most linear in $\gamma$ as it grows large, since the set of agents and their initial qualities are 
fixed. Thus the overall change in welfare can be written as 

$$W'(e) - W(e) \ge P(C_1)\,\frac{(c+\gamma) - (1 - P(C_1))c}{\rho P(C_1)} - P(C_2)\,B(C_2). \tag{16}$$

By choosing $\gamma$ large enough, we can ensure that $P(C_2)$ is small enough that the change 
is positive in all such realizations e. Therefore $W(\gamma) > W$. 

□ 

This theorem states that if the link formation cost is high enough then the social welfare is 
improved over the base model, because two agents that decide to form a new link will do so 
with high reputations. Thus the social welfare generated by a new link is likely to be high as 
well, and this dominates any potential informational externalities that the link could create. Note 
however that a $\gamma$ that is too low may actually harm welfare. For instance, suppose there is a 
group of moderate expected quality agents that are all linked to a very high expected quality 
agent, but separated from each other according to $\Omega$. This is similar to the core-periphery setting 
examined in Theorem. In such a case, allowing moderate reputation agents to link with each 
other would cause them to harm each other via the negative informational effects of the link. 
This would reduce welfare overall compared to the base model. Therefore allowing for new link 
formation can improve welfare, but the threshold for the link being formed must be sufficiently 
high as well. The optimal $\gamma$ depends on the specific properties of the network. If, as in 
the example, there exists a group of very high reputation agents that the moderate reputation 
agents are linked with, then the optimal $\gamma$ would likely be higher as well, as it becomes more important for 
moderate reputation agents not to be linked with each other. 

C. Agent Entry 

Our model can also be tractably extended to allow agents to enter the network over time. 
Specifically, suppose that for the set of N agents in V there is a corresponding set of entry times 
$\{e_i\}_{i \in V}$, with $e_i \ge 0\ \forall i$. Agents with $e_i = 0$ are present in the network at the beginning, while 
agents who have $e_i > 0$ enter later on. These entry times are fixed and known to the agents in the 
model. The network constraint $\Omega$ is now defined over the set of all N potential agents and still 
specifies which agents are allowed to connect to each other, including agents that arrive later. 
This network constraint effectively determines where agents enter the network at their entry 
times. The learning process is the same as before, with learning occurring for agents within the 
network based on their current number of neighbors, and no learning occurring for an agent that 
has not yet entered. 

Agents still make decisions myopically and will connect with a neighbor for as long as 
that neighbor's reputation is above the connection cost. Since we assume all agents have initial 
reputations above the cost, an incumbent agent will always wish to connect with a newly entering 
agent. However, the new agent would not want to connect with one of its neighbors that has 
already been ostracized within the network. The dynamics will evolve similarly to 
before, with agents connecting to neighbors until a neighbor's reputation falls too low, at which 
point the neighbor will be ostracized. The difference now is that new agents will arrive at certain 
times, and when they do they will change the benefits and amount of information produced by 
the network. 

We can compare the model with agent entry against the base model where all agents were 
present in the beginning, i.e. $e_i = 0\ \forall i \in V$. We fix a network constraint and perform 
comparative statics on the entry times of the agents. We first show that incorporating agent entry 
will not change either the set or the distribution of stable networks. 

Proposition 10. The set of stable networks is unchanged with agent entry. The probability of 
each stable network emerging is the same as that given in Corollary and identical to the case 
without agent entry. 

Proof. First note that Proposition still holds for each agent, regardless of the specific entry 
times. This is because the later entry of an agent only shifts the time at which it gets ostracized, 
but does not change whether it is ever ostracized. Since the probability that each agent is 
ostracized is not affected, the set of stable networks and the probability that each stable network 
emerges do not change either. Thus the same probability distribution over stable networks as 
in Corollary will result. □ 

Although the properties of the final stable networks are not affected by agent entry, the overall 
social welfare will be affected. It is possible to calculate social welfare in a similar method as 
in Theorem as we can account for agent entry by rescaling the hitting times of the agents in 
the network appropriately. Incorporating agent entry has two separate effects on social welfare: 
first, the links that the entering agent has are started later, so the benefits from those links are 
realized later as well and thus discounted more heavily; second, the neighbors of the entering 
agent send less information before that agent enters, and the agent itself may send information 
more slowly if one of its neighbors is ostracized before it enters, delaying the time at which the 
agent and its neighbors are potentially ostracized from the network. The first effect hurts social 
welfare because the benefits from any link are positive in expectation. However, the second 
effect can improve social welfare by delaying the agents’ ostracization times and increasing the 
benefits that each agent is able to extract from the network. It is possible for the second effect 
to dominate the first, so that delaying entry for an agent raises social welfare overall. 

Theorem 7. For some network parameters, increasing a single agent's entry time $e_i$ can increase 
social welfare. 

Proof. We prove the claim using an example, shown in Figure. In this network of three agents, suppose 
that the white agent's expected quality is very high. Suppose both the red (large circle and 
bolded line) and the blue (small circle and thin line) agents' expected qualities are very close to 
c. Since the white agent's expected quality is very high, the social welfare of the network will 
be completely determined by the amount of time the blue agent connects with the white agent. 
By delaying the entry of the red agent, the blue agent is able to stay connected for longer in 
each realization, and so social welfare increases. □ 

Fig. 5. Example for Theorem 

In the example of Figure note that although delaying the entry of the red (large circle and 
bolded line) agent is helpful, it is still better to have the agent enter at some finite time instead 
of never entering. This is because the blue agent’s reputation will eventually converge to its true 
quality by the law of large numbers, and in the case where the blue (small circle and thin line) 
agent has a good true quality, enabling a link with the red agent will produce positive benefits. 
In addition, after waiting for a sufficiently large amount of time, the probability that the blue 
agent ever becomes ostracized if it hasn’t already goes to zero, so the red agent is unlikely to 
impact the blue agent’s connection with the white agent. Therefore delaying the entry of the red 
agent is beneficial, but the red agent should not be excluded from the network altogether. 

Figure shows this trade-off explicitly via a simulation of the example shown in 
Figure. When the new agent enters later, social welfare initially increases because the incumbent 
agents have more time to benefit from their links. However, if the entry time becomes too large 
then the social welfare decreases, since the reputations of the incumbent agents have stabilized 
already, and it is thus better to have the new agent enter sooner and benefit from the network 
as well. 

Fig. 6. Simulation Illustrating Theorem. The simulation uses a network of 3 agents who are linked according to Figure. 
All agents have = 20 and n = 1, and the discount rate is 1. The high expected quality agent has an initial mean of 100, 
while the other two agents have initial means of 1. The entry time of the entering agent ranges from .0005 to 2 at increments 
of .05. The y-axis shows the average social welfare of the network at each entry time. 80000 draws were made of each agent's 
hitting time and true quality level, and the social welfare was then computed by varying the entry time. 

As an implication, a financial regulator may wish to delay new firms from entering the network 
in times of crisis when there is a lot of uncertainty, and then allow them to enter once the crisis 
has ended and reputations are more stable. Or in an organization, a firm may wish not to expand 
too quickly, and instead take the time to allow the current workers to better understand each 
other first. 

D. Agent Re-entry 

Our model can be tractably extended to allow agents to be forgiven and then let back into 
the network. For instance, suppose that a worker in a company can improve its quality after it 
becomes ostracized through some exogenous means, such as going back to school to increase 
its abilities, or taking counseling to better its personality. In financial networks, suppose that 
a bank can get recapitalized by the government after it gets shut out of the network, allowing 
its expected quality to increase. After the agent undergoes this exogenous process, the agent's 
reputation improves and so the other agents are again willing to link with it. We show that agent 
forgiveness in this manner can increase social welfare as well as mitigate the negative effects of 
learning. In fact, learning may now actually become beneficial. 

We model agent forgiveness by assuming that when an agent is ostracized from the network, 
the agent can reenter the network at a later time. How long the agent must wait before re-entry 
is an exogenous parameter, which we denote by L. When the agent reenters, its reputation is 
the same as the reputation that it started out with initially. As mentioned above, 
this re-entry could be the result of the agent undergoing additional training or preparation to 
improve its quality. An alternative interpretation is also possible where this is in fact a new agent 
entering the network, but from the same population or background as the original agent. Thus 
the new agent starts out with the same reputation as the original agent. 

We assume that each agent can reenter into the network as long as it has not already been 
ostracized in the past a total of R times. Therefore an agent can reenter the network as long as 
it has not already reentered R—1 times in the past. R is an exogenous parameter that represents 
the number of times which ostracized agents are willing to undergo the process to improve 
themselves, or the number of replacement agents that can be brought into the network. A higher 
value of R means that the ostracized agents are willing to undergo the improvement process 
even if they have been ostracized multiple times in the past. 

With agent re-entry, we can still compute the set of stable networks, as well as the probability 
that each stable network emerges. The probability that an agent is included in the stable network 
is now equal to the probability that an agent does not get ostracized a total of R times in a 
row. Since the agent’s reputation is redrawn each time upon re-entry, this probability can be 
computed using the products of the probabilities in Proposition The exact formula is given in 
the following proposition. Compared with the original probabilities, agent re-entry implies that 
each agent is more likely to be part of the stable network, since they have more chances with 
which to get a high true quality draw. 

Note: We assume that the reputation upon re-entry equals the initial reputation to avoid adding too many new 
exogenous parameters. Our results can be extended to a more general setting as well where the reputation changes upon re-entry. 




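Proposition 11. $P(S_i)$ depends only on the initial quality distribution and the link cost and can 
be computed by 

$$P(S_i) = 1 - \left[1 - \int_c^\infty \left(1 - \exp\left(-\frac{2}{\sigma_i^2}(\mu_i - c)(q_i - c)\right)\right)\phi\left(\frac{q_i - \mu_i}{\sigma_i}\right)\frac{dq_i}{\sigma_i}\right]^R \tag{17}$$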


Proof. The probability that an agent is ostracized permanently is found by taking 1 minus the 
probability in Proposition and then raising that to the power of R. Therefore the probability 
that an agent is included in the stable network is found by taking 1 minus this probability. □ 

Note that since this probability is very similar to the probability given in Proposition, all of 
the relationships between this probability and the exogenous parameters (initial mean, variance, 
signal precision, and link cost) highlighted in Corollary and Theorem continue to hold. In 
addition, we can derive an analogue of Corollary using these new probabilities. Thus we can 
still characterize the explicit probability that any stable network emerges as time goes to infinity 
and the re-entry process by all the agents has concluded. 
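
The re-entry probability can be evaluated numerically. The sketch below is illustrative only: it assumes $\mu_i > c$ (as the paper assumes for initial reputations), approximates the single-attempt survival integral with a midpoint rule, and then applies the "1 minus the R-th power" construction described in the proof. The function names and the truncation and step-count parameters are hypothetical choices, not part of the original paper.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def survival_probability(mu: float, sigma: float, c: float,
                         upper_sds: float = 12.0, steps: int = 20000) -> float:
    """Single-attempt probability of never being ostracized (requires mu > c).

    Midpoint-rule approximation of the integral over q in (c, infinity) of
    (1 - exp(-2 (mu - c)(q - c) / sigma^2)) * phi((q - mu) / sigma) / sigma,
    truncated at mu + upper_sds * sigma.
    """
    hi = max(c, mu) + upper_sds * sigma
    h = (hi - c) / steps
    total = 0.0
    for k in range(steps):
        q = c + (k + 0.5) * h
        stay = 1.0 - math.exp(-2.0 * (mu - c) * (q - c) / sigma ** 2)
        total += stay * normal_pdf((q - mu) / sigma) / sigma * h
    return total


def inclusion_probability_with_reentry(mu: float, sigma: float, c: float, R: int) -> float:
    """Probability of ending up in the stable network when R attempts are available."""
    p = survival_probability(mu, sigma, c)
    return 1.0 - (1.0 - p) ** R


for R in (1, 2, 5):
    print(R, round(inclusion_probability_with_reentry(2.0, 1.0, 1.0, R), 4))
```

With R = 1 the function reduces to the single-attempt probability, and the value rises toward 1 as R grows, in line with the remark that re-entry makes each agent more likely to be part of the stable network.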

We can also derive results about agent welfare when re-entry is possible. Specifically, we 
can show that if the number of allowed re-entries R is sufficiently large, and the time L that an 
agent takes to reenter is sufficiently small, then learning becomes beneficial. This is intuitive, 
because if information about agents arrives faster, then bad agents can exit the network sooner to 
undergo improvement while good agents stay in and are unaffected. Agent forgiveness thus 
mitigates the negative effects of learning, and makes learning a positive overall. 

Theorem 8. If R is sufficiently large compared to $\tau_i$, and L is sufficiently small compared to 
$\tau_i$, then a small increase in $\tau_i$ increases the overall social welfare of the network. 

Proof. Note that as R converges to infinity, the probability that each agent is included in the 
stable network goes to 1. Therefore the social welfare generated by any agent i will depend 
on the first time instance at which it enters and does not become ostracized. This is because L 
is very small, so agent i loses very little benefit when it is ostracized. The first time at which 
agent i reenters and does not get ostracized is strictly decreasing in its information precision $\tau_i$, 
since faster information implies that each attempt that ends in ostracism ends earlier. Therefore a larger 
signal precision increases overall social welfare. □ 

We can extend the above result to show that a fully connected network is optimal when 
the network is very forgiving and the downtime of re-entry is low. A fully connected network 
would allow all agents to link with each other and benefit from the resulting mutual interactions. 
In addition, since learning is now beneficial, the fact that each agent has many links in a fully 
connected network and thus sends a lot of information also increases social welfare. This result 
highlights the fact that with agent forgiveness, more densely connected networks can become 
optimal, and the designer can allow for more links in the initial networks. 

Theorem 9. If R is sufficiently large compared to $\tau_i$ for all i, and L is sufficiently small compared 
to $\tau_i$ for all i, then a fully connected network is optimal. 

Proof. Similar to the above proof, note that as R converges to infinity, the probability that each 
agent is included in the stable network goes to 1, and so the social welfare generated by any agent 
i will depend on the first time instance at which it enters and does not become ostracized. With 
a fully connected network, each agent has as many neighbors as possible and sends information 
very quickly, and so this first time instance arrives sooner. Notice also that each 
agent's flow payoff is positive at any point in time that it is in the network. Since L is 
very small, agents are in the network almost continuously, and so having more links increases 
the flow benefits that each agent receives. Thus a fully connected network is the optimal initial 
network. □ 


IX. Conclusion 

This paper analyzed agent learning and the resulting network dynamics when there is incomplete 
information. We presented a highly tractable model that explicitly characterized the 
set of stable networks for a given network, showed how learning affects both individual and 
social welfare depending on the specific network topology, and analyzed what optimal initial 
network structures look like for different groups of agents. Our results shed new light on network 
dynamics in real world situations, and they offer guidelines for optimal network design when 
there is initial uncertainty about the agents. When agents are sufficiently myopic in their actions, 
ostracism becomes harmful not just for the ostracized agents themselves, but to all agents in 
an ex ante fashion. A network designer should thus structure links appropriately in order to 
minimize the negative effects of ostracism. 

Our results could be extended in several interesting ways. One natural extension would be 
to allow the qualities of agents to evolve over time. In the simplest extension, the agent's true 
quality $q_i$ itself changes according to an exogenous stochastic process, for instance a Brownian 
motion. More interestingly, it would be natural to assume that the evolution of true quality 
depends endogenously on the information the agent receives so that agents who receive better 
information tend to develop higher true qualities and hence also generate better information in 
the future. Thus, the structure of the network and the true qualities of the agents in the network 
co-evolve. Higher reputation agents may link to agents that are also of higher reputation, and so 
their true qualities would improve as well, while lower reputation agents may struggle to find 
good agents to link with, and their true qualities would decline as a result. 

Other possible extensions include having private information among the agents instead of 
locally public information. In this way agents would learn about their neighbors at different 
rates, and so they may make different decisions when connecting or disconnecting with other 
agents. This could mitigate the negative effects of learning, as information is different 
across links, and so having more links does not increase the rate of learning. Agent preferences 
could also be heterogeneous, which would further increase the diversity of links and the range 
of linking decisions. This is a topic we are currently researching in van der Schaar and Zhang 
(2015). 

Finally, it would be interesting to allow agents to engage in games with their linked neighbors 
instead of merely generating flow benefits. Games played over networks have been analyzed in 
several papers within the networks literature (see Jackson and Zenou (2014) for a review), but 
never in a dynamic setting with learning such as that considered in the current paper. The game 
played by agents could be a prisoner’s dilemma or another type of cooperation game where the 
payoffs depend on the agents' types. Agents would need to seek out other agents with whom they can 
achieve high payoffs in the game, and this process would also require learning over time 
about a neighbor’s type. As agents are able to learn each other’s type more accurately, they may 
achieve greater efficiency in their plays and also sustain cooperation for a longer length of time. 

Appendix 

Proof of Proposition [H 

Proof. Suppose for now that agent i's reputation always evolves at the constant signal precision 
$\tau_i$. Then given the true quality $q_i$ for agent i, the probability that agent i's reputation never hits c 
before t can be found using standard arguments (see for example Wang and Potzelberger (1997)) 
and is given by 

$$P(S_i^t \mid q_i) = \Phi\left(\sqrt{t\tau_i}\,(q_i - c) + \frac{\mu_i - c}{\sigma_i^2\sqrt{t\tau_i}}\right) \tag{18}$$

$$\qquad\qquad - \exp\left(-\frac{2}{\sigma_i^2}(\mu_i - c)(q_i - c)\right)\Phi\left(\sqrt{t\tau_i}\,(q_i - c) - \frac{\mu_i - c}{\sigma_i^2\sqrt{t\tau_i}}\right) \tag{19}$$

Therefore, given $q_i$, the probability that agent i stays in the network is 

$$P(S_i \mid q_i) = \lim_{t \to \infty} P(S_i^t \mid q_i) \tag{20}$$

• If $q_i > c$, as $t \to \infty$, then we have $\Phi\left(\sqrt{t\tau_i}\,(q_i - c) + \frac{\mu_i - c}{\sigma_i^2\sqrt{t\tau_i}}\right) \to 1$ and 
$\Phi\left(\sqrt{t\tau_i}\,(q_i - c) - \frac{\mu_i - c}{\sigma_i^2\sqrt{t\tau_i}}\right) \to 1$. Thus, $P(S_i \mid q_i) = 1 - \exp\left(-\frac{2}{\sigma_i^2}(\mu_i - c)(q_i - c)\right)$, namely 
agent i stays in the network with positive probability and the probability is increasing in 
the true quality $q_i$. 

• If $q_i < c$, as $t \to \infty$, then we have $\Phi\left(\sqrt{t\tau_i}\,(q_i - c) + \frac{\mu_i - c}{\sigma_i^2\sqrt{t\tau_i}}\right) \to 0$ and 
$\Phi\left(\sqrt{t\tau_i}\,(q_i - c) - \frac{\mu_i - c}{\sigma_i^2\sqrt{t\tau_i}}\right) \to 0$, thus $P(S_i \mid q_i) = 0$, namely agent i's reputation hits c 
before $t = \infty$ for sure. 

• If $q_i = c$, it is clear that $P(S_i^t \mid q_i) \to 0$ as $t \to \infty$. 

Taking the expectation over $q_i$, we have 

$$P(S_i) = \int_c^\infty \left(1 - \exp\left(-\frac{2}{\sigma_i^2}(\mu_i - c)(q_i - c)\right)\right)\phi\left(\frac{q_i - \mu_i}{\sigma_i}\right)\frac{dq_i}{\sigma_i} \tag{21}$$

From the above expression we can see that $P(S_i)$ depends only on the initial quality distribution 
($\mu_i$ and $\sigma_i$) and the link cost c, but does not depend on the Brownian motion precision $\tau_i$. Since 
breaking links only changes the Brownian motion precision, the probability that an agent's 
reputation never hits c is independent of the initial network or the signal precision $\tau_i$. □ 
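
The finite-horizon survival probability and its limit can be checked numerically. The following sketch assumes the reconstruction of the survival formula with arguments $\sqrt{t\tau_i}(q_i - c) \pm (\mu_i - c)/(\sigma_i^2\sqrt{t\tau_i})$; the function names are hypothetical and the standard normal c.d.f. is built from `math.erf`.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Standard normal c.d.f. via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def survival_by_time(t: float, q: float, mu: float, sigma: float, c: float, tau: float) -> float:
    """Probability that the reputation has not hit c by time t, given true quality q."""
    a = math.sqrt(t * tau) * (q - c)                   # drift term
    b = (mu - c) / (sigma ** 2 * math.sqrt(t * tau))   # distance-to-threshold term
    weight = math.exp(-2.0 * (mu - c) * (q - c) / sigma ** 2)
    return std_normal_cdf(a + b) - weight * std_normal_cdf(a - b)


def survival_forever(q: float, mu: float, sigma: float, c: float) -> float:
    """Limit as t grows: zero for q <= c, otherwise 1 - exp(-2 (mu - c)(q - c) / sigma^2)."""
    if q <= c:
        return 0.0
    return 1.0 - math.exp(-2.0 * (mu - c) * (q - c) / sigma ** 2)


# The precision tau only rescales time: doubling tau and halving t gives the same value.
print(survival_by_time(2.0, 2.0, 2.0, 1.0, 1.0, 1.0),
      survival_by_time(1.0, 2.0, 2.0, 1.0, 1.0, 2.0))
# For large t the finite-horizon value approaches the limit, whatever tau is.
print(survival_by_time(1e6, 2.0, 2.0, 1.0, 1.0, 5.0), survival_forever(2.0, 2.0, 1.0, 1.0))
```

Because $t$ and $\tau_i$ enter only through the product $t\tau_i$, the limit cannot depend on the precision, which is the independence property used at the end of the proof.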

Proof of Corollary [T] 

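Proof. We first show that $P(S_i)$ is increasing in $\mu_i$. Let $q_i - \mu_i = x$. Then $P(S_i)$ can be rewritten 
as 

$$P(S_i) = \int_{c-\mu_i}^{\infty} \left(1 - \exp\left(-\frac{2}{\sigma_i^2}(\mu_i - c)(\mu_i - c + x)\right)\right)\phi\left(\frac{x}{\sigma_i}\right)\frac{dx}{\sigma_i} \tag{22}$$

Consider a larger expected quality $\mu_i' > \mu_i$; we have 

$$P(S_i \mid \mu_i') = \int_{c-\mu_i'}^{\infty} \left(1 - \exp\left(-\frac{2}{\sigma_i^2}(\mu_i' - c)(\mu_i' - c + x)\right)\right)\phi\left(\frac{x}{\sigma_i}\right)\frac{dx}{\sigma_i} \tag{23}$$

$$> \int_{c-\mu_i}^{\infty} \left(1 - \exp\left(-\frac{2}{\sigma_i^2}(\mu_i - c)(\mu_i' - c + x)\right)\right)\phi\left(\frac{x}{\sigma_i}\right)\frac{dx}{\sigma_i} \tag{24}$$

$$> \int_{c-\mu_i}^{\infty} \left(1 - \exp\left(-\frac{2}{\sigma_i^2}(\mu_i - c)(\mu_i - c + x)\right)\right)\phi\left(\frac{x}{\sigma_i}\right)\frac{dx}{\sigma_i} = P(S_i \mid \mu_i) \tag{25}$$

Therefore, $P(S_i)$ is increasing in $\mu_i$. 

Next we show that $P(S_i)$ is decreasing in $\sigma_i$. 

$$P(S_i) = \int_{c-\mu_i}^{\infty} \phi\left(\frac{x}{\sigma_i}\right)\frac{dx}{\sigma_i} - \int_{c-\mu_i}^{\infty} e^{-\frac{2}{\sigma_i^2}(\mu_i - c)(\mu_i - c + x)}\,\frac{1}{\sqrt{2\pi}}\,e^{-\frac{x^2}{2\sigma_i^2}}\,\frac{dx}{\sigma_i} \tag{26}$$

$$= \int_{c-\mu_i}^{\infty} \phi\left(\frac{x}{\sigma_i}\right)\frac{dx}{\sigma_i} - \int_{c-\mu_i}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2\sigma_i^2}\left(2(\mu_i - c) + x\right)^2}\,\frac{dx}{\sigma_i} \tag{27}$$

$$= \int_{c-\mu_i}^{\infty} \phi\left(\frac{x}{\sigma_i}\right)\frac{dx}{\sigma_i} - \int_{\mu_i - c}^{\infty} \phi\left(\frac{x}{\sigma_i}\right)\frac{dx}{\sigma_i} = \int_{c-\mu_i}^{\mu_i - c} \phi\left(\frac{x}{\sigma_i}\right)\frac{dx}{\sigma_i} \tag{28}$$

The last integral is the probability that a normal random variable with mean zero and standard 
deviation $\sigma_i$ falls in the fixed interval $[c - \mu_i, \mu_i - c]$, which shrinks as $\sigma_i$ grows. 
Therefore, $P(S_i)$ is decreasing in $\sigma_i$. 

Finally, we show that $P(S_i)$ is decreasing in c. Consider a smaller $c' < c$; we have 

$$P(S_i \mid c) = \int_{c}^{\infty} \left(1 - \exp\left(-\frac{2}{\sigma_i^2}(\mu_i - c)(q_i - c)\right)\right)\phi\left(\frac{q_i - \mu_i}{\sigma_i}\right)\frac{dq_i}{\sigma_i} \tag{29}$$

$$< \int_{c}^{\infty} \left(1 - \exp\left(-\frac{2}{\sigma_i^2}(\mu_i - c')(q_i - c')\right)\right)\phi\left(\frac{q_i - \mu_i}{\sigma_i}\right)\frac{dq_i}{\sigma_i} \tag{30}$$

$$< \int_{c'}^{\infty} \left(1 - \exp\left(-\frac{2}{\sigma_i^2}(\mu_i - c')(q_i - c')\right)\right)\phi\left(\frac{q_i - \mu_i}{\sigma_i}\right)\frac{dq_i}{\sigma_i} = P(S_i \mid c') \tag{31}$$

The first inequality is because for $q_i > c$, $1 - \exp\left(-\frac{2}{\sigma_i^2}(\mu_i - c)(q_i - c)\right) < 1 - \exp\left(-\frac{2}{\sigma_i^2}(\mu_i - c')(q_i - c')\right)$. 
The second inequality is because for $c' < q_i < c$, $1 - \exp\left(-\frac{2}{\sigma_i^2}(\mu_i - c')(q_i - c')\right) > 0$. □ 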
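
The integral reached in the $\sigma_i$ step of this proof, $\int_{c-\mu_i}^{\mu_i-c}\phi(x/\sigma_i)\,dx/\sigma_i$, can be written in closed form with the standard normal c.d.f. $\Phi$. The step below is a worked consequence of the symmetry of $\phi$ and is added here as an illustration; it is not stated in the original proof.

```latex
\begin{align*}
P(S_i)
  &= \int_{c-\mu_i}^{\mu_i-c} \phi\!\left(\frac{x}{\sigma_i}\right)\frac{dx}{\sigma_i}
   = \Phi\!\left(\frac{\mu_i-c}{\sigma_i}\right) - \Phi\!\left(-\frac{\mu_i-c}{\sigma_i}\right) \\
  &= 2\,\Phi\!\left(\frac{\mu_i-c}{\sigma_i}\right) - 1 .
\end{align*}
```

In this form all three comparative statics can be read off directly: the ratio $(\mu_i - c)/\sigma_i$ rises with $\mu_i$ and falls with $\sigma_i$ and $c$ (given $\mu_i > c$), and $\Phi$ is increasing.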

Fig. 7. Set of Stable Networks Given an Initial Network: This figure shows the five possible stable networks that could emerge 
given an initial network of three agents. In addition, $P(S_i)$ is given for all the agents, which allows us to calculate the exact 
probability of each of these networks emerging. For the first four networks, there is only one realization of $\{S_i, \neg S_i\}_{i \in V}$ that 
corresponds to it. For the last network, there are four possible realizations, one in which $\neg S_i$ occurs for all agents, and three in 
which $S_i$ occurs for a single agent. 


Proof of Lemma [H 

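Proof. Since the Brownian motion precision is constant, using the survival probability 

$$P(S_i^t \mid q_i) = \Phi\left(\sqrt{t\tau_i}\,(q_i - c) + \frac{\mu_i - c}{\sigma_i^2\sqrt{t\tau_i}}\right) \tag{32}$$

$$\qquad\qquad - \exp\left(-\frac{2}{\sigma_i^2}(\mu_i - c)(q_i - c)\right)\Phi\left(\sqrt{t\tau_i}\,(q_i - c) - \frac{\mu_i - c}{\sigma_i^2\sqrt{t\tau_i}}\right) \tag{33}$$

we can compute $f(e_i^t \mid q_i) = -\frac{\partial P(S_i^t \mid q_i)}{\partial t}$ as 

$$f(e_i^t \mid q_i) = -\frac{1}{2}\left(\sqrt{\tau_i}\,(q_i - c)\,t^{-\frac{1}{2}} - \frac{\mu_i - c}{\sigma_i^2\sqrt{\tau_i}}\,t^{-\frac{3}{2}}\right)\phi\left(\sqrt{t\tau_i}\,(q_i - c) + \frac{\mu_i - c}{\sigma_i^2\sqrt{t\tau_i}}\right)$$

$$\qquad + e^{-\frac{2}{\sigma_i^2}(\mu_i - c)(q_i - c)}\,\frac{1}{2}\left(\sqrt{\tau_i}\,(q_i - c)\,t^{-\frac{1}{2}} + \frac{\mu_i - c}{\sigma_i^2\sqrt{\tau_i}}\,t^{-\frac{3}{2}}\right)\phi\left(\sqrt{t\tau_i}\,(q_i - c) - \frac{\mu_i - c}{\sigma_i^2\sqrt{t\tau_i}}\right)$$

$$\qquad = \frac{\mu_i - c}{\sigma_i^2\sqrt{\tau_i}}\,t^{-\frac{3}{2}}\,\phi\left(\sqrt{t\tau_i}\,(q_i - c) + \frac{\mu_i - c}{\sigma_i^2\sqrt{t\tau_i}}\right) \tag{34–38}$$

Taking the expectation over $q_i$, we obtain $f(e_i^t)$. □ 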
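
The relation between the hitting-time density and the survival probability can be verified numerically. The sketch below is an illustration under stated assumptions: it uses the survival formula with arguments $\sqrt{t\tau_i}(q_i - c) \pm (\mu_i - c)/(\sigma_i^2\sqrt{t\tau_i})$, and a simplified density obtained from the identity $\phi(a+b) = e^{-2ab}\phi(a-b)$, which collapses $-\partial P/\partial t$ to a single term. All function names are hypothetical.

```python
import math


def std_normal_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z: float) -> float:
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def survival_by_time(t, q, mu, sigma, c, tau):
    """Probability that the reputation has not hit c by time t, given true quality q."""
    a = math.sqrt(t * tau) * (q - c)
    b = (mu - c) / (sigma ** 2 * math.sqrt(t * tau))
    weight = math.exp(-2.0 * (mu - c) * (q - c) / sigma ** 2)
    return std_normal_cdf(a + b) - weight * std_normal_cdf(a - b)


def hitting_density(t, q, mu, sigma, c, tau):
    """Density of the hitting time at t, i.e. minus the time derivative of the survival probability.

    Simplified form: (mu - c) / (sigma^2 sqrt(tau)) * t^(-3/2) * phi(a + b), written here as (b / t) * phi(a + b).
    """
    a = math.sqrt(t * tau) * (q - c)
    b = (mu - c) / (sigma ** 2 * math.sqrt(t * tau))
    return (b / t) * std_normal_pdf(a + b)


# Compare the closed form with a central finite difference of the survival probability.
t, q, mu, sigma, c, tau = 1.0, 2.0, 2.0, 1.0, 1.0, 1.0
h = 1e-4
numeric = (survival_by_time(t - h, q, mu, sigma, c, tau)
           - survival_by_time(t + h, q, mu, sigma, c, tau)) / (2.0 * h)
print(numeric, hitting_density(t, q, mu, sigma, c, tau))
```

The two printed numbers should agree closely, which gives a quick sanity check on both the survival formula and the simplified density.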



Hitting Time Mapping Function 
Input: base precision $\tau_i$, $\forall i$, and initial graph $G^0$. 
Output: new hitting time vector $\hat{t}$. 
Initiate: $d_i = t_i$, $\forall i$. $\tau_i' = k_i \tau_i$. 
Initiate: $N = \{i : t_i < \infty\}$, $\hat{t}_i = 0$, $\forall i \in N$, and $\hat{t}_i = \infty$, $\forall i \notin N$. 
while $N \neq \emptyset$ do 
    Let $i^* = \arg\min_{i \in N} d_i / \tau_i'$. 
    Update $\hat{t}_i := \hat{t}_i + d_{i^*} / \tau_{i^*}'$, $\forall i \in N$. 
    Update $d_i := d_i - \tau_i' \times d_{i^*} / \tau_{i^*}'$. 
    Update $N := N \setminus i^*$. 
    Update $k_i = \max\{1, k_i - 1\}$, for all i such that $g_{i i^*} = 1$. 
    Update $\tau_i' = k_i \tau_i$. 
end while 

Algorithm for Computing Hitting Time Mapping Function M 
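
The loop above can be sketched in Python. This is an illustrative reading of the pseudocode rather than the authors' implementation: it assumes that $t_i$ is the hitting time measured at unit total precision, that $k_i$ starts as the number of neighbors in the initial graph (with a floor of 1), and that the graph is given as a symmetric 0/1 adjacency matrix. The function name `map_hitting_times` is hypothetical.

```python
import math


def map_hitting_times(t, tau, adjacency):
    """Rescale base hitting times t into network hitting times.

    t[i]: hitting time of agent i at unit total precision (math.inf if it never hits).
    tau[i]: base precision of agent i.
    adjacency: symmetric 0/1 matrix of the initial graph.
    """
    n = len(t)
    k = [max(1, sum(adjacency[i])) for i in range(n)]   # current number of neighbors (floor of 1)
    rate = [k[i] * tau[i] for i in range(n)]            # current total precision of each agent
    d = list(t)                                         # remaining "information distance" to the threshold
    active = {i for i in range(n) if math.isfinite(t[i])}
    t_hat = [0.0 if i in active else math.inf for i in range(n)]

    while active:
        # Next agent to be ostracized and the elapsed time until that happens.
        i_star = min(active, key=lambda i: d[i] / rate[i])
        dt = d[i_star] / rate[i_star]
        for i in active:
            t_hat[i] += dt
            d[i] -= rate[i] * dt
        active.remove(i_star)
        # Neighbors of the ostracized agent lose a link and therefore learn more slowly.
        for i in range(n):
            if adjacency[i][i_star] == 1:
                k[i] = max(1, k[i] - 1)
                rate[i] = k[i] * tau[i]
    return t_hat


# Line network 0 - 1 - 2: the middle agent has two links, so its clock runs twice as fast.
line = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(map_hitting_times([2.0, 1.0, math.inf], [1.0, 1.0, 1.0], line))
```

In the line example the middle agent is ostracized at time 0.5 instead of 1.0, while the first agent, which always learns at its base rate, is still ostracized at time 2.0.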

Proof of Theorem [3] 


Proof. Consider the ex ante surplus $W_{ij}$ that agent i obtains from the link with a neighbor j. 
The ex ante welfare for agent i is simply the sum of this surplus over all j that i is linked 
with. $W_{ij}$ can be computed as 

$$W_{ij} = \int_q \int_0^\infty e^{-\rho t}\,P(L_{ij}^t \mid q)\,(q_j - c)\,dt\,\phi(q)\,dq \tag{39}$$

where $P(L_{ij}^t \mid q)$ is the probability that the link between i and j still exists at time t. Let $t^*$ be 
the time at which the link between i and j is broken. Then the social welfare can be computed 
as 

$$W_{ij} = \int_q \int_0^\infty e^{-\rho t}(q_j - c)\,dt\,\phi(q)\,dq - E_{t^*}\left[\int_{t^*}^\infty e^{-\rho t}E_{q_j}(q_j - c \mid t > t^*)\,dt\right] \tag{40}$$

$$= \int_0^\infty e^{-\rho t}(\mu_j - c)\,dt - E_{t^*}\left[\int_{t^*}^\infty e^{-\rho t}E_{q_j}(q_j - c \mid t > t^*)\,dt\right] \tag{41}$$

where the expectation is taken over the realizations in which the hitting time is $t^*$. The second 
term can be further decomposed. Let $t_i^*$ denote the case when $t^* = t_i$, namely agent i's reputation 
hits c before agent j's, and $t_j^*$ be the case where $t^* = t_j$, namely agent j's reputation hits c before 
agent i's. Then 

$$W_{ij} = \int_0^\infty e^{-\rho t}(\mu_j - c)\,dt - E_{t_i^*}\left[\int_{t_i^*}^\infty e^{-\rho t}E_{q_j}(q_j - c \mid t > t_i^*)\,dt\right] - E_{t_j^*}\left[\int_{t_j^*}^\infty e^{-\rho t}E_{q_j}(q_j - c \mid t > t_j^*)\,dt\right] \tag{42}$$

In the case of $t_j^*$, for any $t > t_j^*$, since the learning has stopped, $E_{q_j}(q_j - c \mid t > t_j^*) = 0$ by the 
definition of $t_j^*$. Similarly, in the case of $t_i^*$, $E_{q_j}(q_j - c \mid t > t_i^*) > 0$ because at $t_i^*$ the expected quality of agent j is 
strictly greater than c since agent j has not been ostracized. Therefore, writing $W_{ij}^*$ for the first 
integral in (42), the surplus without learning, 

$$W_{ij} = W_{ij}^* - E_{t_i^*}\left[\int_{t_i^*}^\infty e^{-\rho t}E_{q_j}(q_j - c \mid t > t_i^*)\,dt\right] < W_{ij}^* \tag{43}$$

Summing over all j that i is linked with, we conclude that agent i's ex ante welfare 
with learning is strictly less than that when there is no learning. Note that this result holds 
independently of the values of $\tau_1, \ldots, \tau_N$, as the signal precisions affect only the distribution 
of agent hitting times, but not the expected quality of the agents conditional on ever being 
ostracized. □ 
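
The no-learning benchmark used in this proof, the discounted integral of the expected flow surplus $\mu_j - c$, has a simple closed form. The evaluation below is a worked step added for illustration, assuming a discount rate $\rho > 0$; it is a standard exponential integral and not a new result.

```latex
\begin{align*}
W_{ij}^{*}
  &= \int_0^\infty e^{-\rho t}\,(\mu_j - c)\,dt
   = (\mu_j - c)\left[-\frac{e^{-\rho t}}{\rho}\right]_0^\infty
   = \frac{\mu_j - c}{\rho}.
\end{align*}
```

For example, with $\mu_j - c = 2$ and $\rho = 1$ the benchmark surplus of the link to agent i is 2, and any positive expected loss after an ostracism time lowers $W_{ij}$ strictly below this value.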

Proof of Proposition [5] 


Proof. From Theorem we have that the ex ante social welfare is given by: 

1 — fij — c 


w = e,J2 


p 


E 


j-g^j=i,tj=oc 


P{Sf 


(44) 


If the designer is completely impatient, it only cares about the social surplus at time 0 since with 
probability approaching 1 no links will be broken among the agents. Since all agents’ expected 
qualities are above the linking cost, having all agents connected with each other yields the highest 
social surplus. Similarly if learning becomes very slow, then the agents' reputations are never 
updated and the same reasoning applies. In both cases the term approaches zero in the 

above equation regardless of the network structure, and so adding more agents increases welfare. 

If the designer is completely patient, only the stable networks matter. Since the stable network 
does not depend on the speed of learning and the probability that an agent stays in the stable 
network is independent of others by Proposition having all agents connected with each other 
leads to the maximum number of links in the stable networks and hence the highest social surplus. 
Similarly if learning becomes very fast, the stable network will always be reached immediately 
and the same reasoning applies. In both cases the term approaches one in the above 

equation regardless of the network structure, and so adding more agents increases welfare. □ 


Proof of Theorem [5] 


Proof. (1) Consider the welfare on link As in the proof of Theorem]^ let t* denote the event 
in which agent i’s reputation hits c — 6 at time t* before agent j. The ex ante welfare of link 
can be computed as 


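\begin{align}
W_{ij} + W_{ji} ={} & \int_0^\infty e^{-\rho t}(\mu_i + \mu_j - 2c)\,dt \tag{45} \\
& - P(t_i^*)\,E_{t_i^*}\left[\int_{t_i^*}^\infty e^{-\rho t} E_{q_i,q_j}(q_i + q_j - 2c \mid t > t_i^*)\,dt\right] \tag{46} \\
& - P(t_j^*)\,E_{t_j^*}\left[\int_{t_j^*}^\infty e^{-\rho t} E_{q_i,q_j}(q_i + q_j - 2c \mid t > t_j^*)\,dt\right] \tag{47}
\end{align}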


Note that the first integral in the above equation represents $W_{ij}^* + W_{ji}^*$, the social welfare of the 
link without learning. 

In the case of $t_i^*$, for any $t > t_i^*$, since learning has stopped, $E_{q_i,q_j}(q_i - c \mid t > t_i^*) = c - \delta - c = -\delta$ 
by the definition of $t_i^*$. Since agent $j$ is not ostracized, we have $E_{q_i,q_j}(q_j - c \mid t > t_i^*) > -\delta$. 
Let $h(\delta, t_i^*) = E_{q_i,q_j}(q_i + q_j - 2c \mid t > t_i^*) = E_{q_i,q_j}(q_j \mid t > t_i^*) - c - \delta$, where the second 
equality substitutes $E_{q_i,q_j}(q_i \mid t > t_i^*) = c - \delta$. This is the net change in 
flow payoff after the link is severed. We will show that for any $t_i^*$, $h(\delta, t_i^*) < 0$ if $\delta$ is sufficiently 
large. A symmetric argument then establishes that $h(\delta, t_j^*) < 0$, and the two together imply that 
the welfare of the link with learning is greater than $W_{ij}^* + W_{ji}^*$. Then, adding up over all links 
shows that the overall social welfare is higher than that without learning. 

To prove that $h(\delta, t_i^*) < 0$ if $\delta$ is sufficiently large, we will show that $E_{q_i,q_j}(q_j \mid t > t_i^*)$ is 
bounded above for any $t_i^*$ as $\delta$ tends to infinity. Consider any ex post realization of $t_i^*$, which 
implies that agent $j$’s reputation does not hit $c - \delta$ before $t_i^*$. There are two possibilities for 
agent $j$’s reputation (here we assume that agent $j$ continues sending information at its fixed 
signal precision if all its other neighbors are ostracized, as in section 4): 

• $C_1$: it never hits $c - \delta$ after $t_i^*$ either. 

• $C_2$: it hits $c - \delta$ at some time after $t_i^*$. 

Clearly, $E(q_j \mid C_1) > E(q_j \mid C_2) = c - \delta$. Hence $E_{q_i,q_j}(q_j \mid t > t_i^*) < E(q_j \mid C_1)$. The value of $E(q_j \mid C_1)$ 
is given by equation (6) in the text, with $c$ replaced by $c - \delta$. When $\delta \to \infty$, using equation (6) 
we can show that $\lim_{\delta \to \infty} E(q_j \mid C_1) = \mu_j$ through the application of L’Hôpital’s rule. 

Therefore, $\forall \epsilon > 0$, there exists $\delta_k$ such that $\forall \delta > \delta_k$, $E(q_j \mid C_1) - \mu_j < \epsilon$. Hence, fix a value of 
$\epsilon > 0$ and let $\delta_{ij} = \max\{\delta_k, \mu_j - c + \epsilon\}$, which ensures that for all $\delta > \delta_{ij}$, $E(q_j \mid C_1) - c - \delta < 0$. 
This also implies that $h(\delta, t_i^*) < 0$ for all $t_i^*$ and $\delta > \delta_{ij}$. By choosing $\delta > \max_{ij} \delta_{ij}$, we ensure 
the overall ex ante social welfare is greater than $W^*$. 
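
The choice of the threshold in part (1) can be checked numerically. The following is a minimal sketch, not the authors' procedure: the function names and all numerical values are hypothetical, and it assumes the flow payoff after severing is $h = E(q_j \mid t > t^*) - c - \delta$, which follows from the ostracized agent's expected quality being $c - \delta$. It only uses the bound $E(q_j \mid C_1) < \mu_j + \epsilon$, which holds for $\delta > \delta_k$.

```python
def link_threshold(mu_j, c, eps, delta_k):
    """Return delta_ij = max(delta_k, mu_j - c + eps) for one link (assumed form)."""
    return max(delta_k, mu_j - c + eps)


def h_upper_bound(mu_j, c, eps, delta):
    """Upper bound on h(delta, t*), valid only for delta > delta_k.

    Uses E(q_j | C_1) < mu_j + eps, so h < (mu_j + eps) - c - delta.
    """
    return (mu_j + eps) - c - delta


# Hypothetical parameters: prior mean, linking cost, tolerance, and the
# delta_k beyond which E(q_j | C_1) is within eps of mu_j.
mu_j, c, eps, delta_k = 2.0, 0.5, 0.1, 1.0

delta_ij = link_threshold(mu_j, c, eps, delta_k)

# For every delta strictly above delta_ij the bound on h is negative.
bounds = [h_upper_bound(mu_j, c, eps, delta_ij + step) for step in (0.01, 1.0, 10.0)]
```

With several links, the same function would be evaluated per link and the largest value taken, mirroring the maximum over $ij$ in the text.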

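(2) Define $H_{ij}(\delta) = E_{t_i^*}\left[\int_{t_i^*}^\infty e^{-\rho t} E_{q_i,q_j}(q_i + q_j - 2c \mid t > t_i^*)\,dt\right]$. We will prove $\lim_{\delta \to \infty} H_{ij}(\delta) = 0$. 

To prove this, we will show that for any sequence $\delta_n \to \infty$, the sequence $H_{ij}(\delta_n) \to 0$. We 
divide $H_{ij}(\delta)$ into two parts. 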



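\begin{align}
H_{ij}(\delta) ={} & E_{t_i^* < \bar{t}(\delta)}\left[\int_{t_i^*}^\infty E_{q_i,q_j}\left[e^{-\rho t}(q_i + q_j - 2c \mid t > t_i^*)\right]dt\right] \tag{48} \\
& + E_{t_i^* > \bar{t}(\delta)}\left[\int_{t_i^*}^\infty E_{q_i,q_j}\left[e^{-\rho t}(q_i + q_j - 2c \mid t > t_i^*)\right]dt\right] \tag{49} \\
={} & H'_{ij}(\delta) + H''_{ij}(\delta) \tag{50}
\end{align}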


for some $\bar{t}(\delta)$. We will find a sequence $\bar{t}(\delta_n)$ such that both $H'_{ij}(\delta_n) \to 0$ and $H''_{ij}(\delta_n) \to 0$ as 
$\delta_n \to \infty$. 

Let $\bar{t}(\delta_n) = \delta_n$. First we will show that for $\delta_n$ large enough, $P(t_i^* < \delta_n) < \frac{1}{\delta_n^2}$. Note that for a 
given $q_i$, the probability that the agent is ostracized before time $\delta_n$ is equal to: 
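\begin{align}
1 - P(S_i^{\delta_n} \mid q_i) ={} & 1 - \Phi\left(\sqrt{\delta_n \tau_i}\,(q_i - c + \delta_n) + \frac{\mu_i - c + \delta_n}{\sigma_i^2 \sqrt{\delta_n \tau_i}}\right) \tag{51} \\
& + \exp\left(-\frac{2}{\sigma_i^2}(\mu_i - c + \delta_n)(q_i - c + \delta_n)\right) \Phi\left(\sqrt{\delta_n \tau_i}\,(q_i - c + \delta_n) - \frac{\mu_i - c + \delta_n}{\sigma_i^2 \sqrt{\delta_n \tau_i}}\right) \tag{52}
\end{align}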


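Note that $\Phi(x) \approx 1 - \frac{e^{-x^2/2}}{x\sqrt{2\pi}}$ as $x \to \infty$. Therefore the term above approaches zero faster than $\frac{1}{\delta_n^2}$ 
as $\delta_n \to \infty$. Integrating over all $q_i$ shows that $P(t_i^* < \delta_n) < \frac{1}{\delta_n^2}$ for large $\delta_n$. 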
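
The decay claim can be illustrated numerically. The sketch below is hedged: it assumes the ostracism probability has the standard first-passage form for a Brownian motion with drift, with threshold $c - \delta_n$ and horizon $\delta_n$, and it introduces `sigma_i` (prior standard deviation of quality) and `tau_i` (signal precision) as assumed parameter names. The numerical values are hypothetical, and the function is an illustration rather than the paper's own computation.

```python
import math


def norm_cdf(x):
    """Standard normal CDF, computed via erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def ostracism_prob(q_i, mu_i, c, sigma_i, tau_i, delta_n):
    """Probability that the reputation hits c - delta_n before time delta_n,
    given true quality q_i (assumed first-passage formula)."""
    drift = q_i - c + delta_n        # distance of true quality from the threshold
    gap = mu_i - c + delta_n         # distance of the prior mean from the threshold
    root = math.sqrt(delta_n * tau_i)
    a = root * drift + gap / (sigma_i ** 2 * root)
    b = root * drift - gap / (sigma_i ** 2 * root)
    upper_tail = norm_cdf(-a)        # equals 1 - Phi(a), without cancellation
    return upper_tail + math.exp(-2.0 * gap * drift / sigma_i ** 2) * norm_cdf(b)


# Hypothetical parameters, chosen only for illustration.
q_i, mu_i, c, sigma_i, tau_i = 0.8, 1.0, 0.5, 1.0, 1.0
deltas = [2.0, 4.0, 8.0]
probs = [ostracism_prob(q_i, mu_i, c, sigma_i, tau_i, d) for d in deltas]

# If the probability decays faster than 1 / delta_n**2, then
# delta_n**2 * probability should shrink as delta_n grows.
scaled = [d ** 2 * p for d, p in zip(deltas, probs)]
```

For these values the entries of `scaled` fall rapidly as `delta_n` doubles, which is consistent with the bound used in the proof. The check is for one fixed `q_i`; the proof integrates over all `q_i`.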

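Now consider $H'_{ij}(\delta_n)$. It is bounded by 

\begin{align}
|H'_{ij}(\delta_n)| &\le P(t_i^* < \delta_n) \sup_{t_i^* < \delta_n} \left|\int_{t_i^*}^\infty E_{q_i,q_j}\left[e^{-\rho t}(q_i + q_j - 2c \mid t > t_i^*)\right]dt\right| \tag{53} \\
&< \frac{1}{\delta_n^2} \sup_{t_i^* < \delta_n} \left|\int_{t_i^*}^\infty E_{q_i,q_j}\left[e^{-\rho t}(q_i + q_j - 2c \mid t > t_i^*)\right]dt\right| \tag{54} \\
&< \frac{1}{\rho \delta_n^2} \sup_{t_i^* < \delta_n} \left|E[q_j \mid C_1] - c + \delta_n\right| \tag{55}
\end{align}

Since $E[q_j \mid C_1] \to \mu_j$ as $\delta_n \to \infty$, we conclude that $|H'_{ij}(\delta_n)| \to 0$. 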
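
The final step of this bound rests on two elementary facts: the discount integral is at most $1/\rho$, and a quantity growing linearly in $\delta_n$ is dominated by $\delta_n^2$. A brief sketch, assuming that the conditional expectation is constant in $t$ once learning has stopped, that the probability bound is of order $1/\delta_n^2$ (the exponent is an assumption here), and writing $K$ (a symbol introduced for this sketch, not in the original) for any bound on $|E[q_j \mid C_1] - c|$:

```latex
\int_{t^*}^{\infty} e^{-\rho t}\,dt = \frac{e^{-\rho t^*}}{\rho} \le \frac{1}{\rho},
\qquad
\frac{1}{\rho\,\delta_n^{2}}\,\bigl|E[q_j \mid C_1] - c \pm \delta_n\bigr|
\le \frac{K + \delta_n}{\rho\,\delta_n^{2}}
= \frac{K}{\rho\,\delta_n^{2}} + \frac{1}{\rho\,\delta_n}
\longrightarrow 0 \quad (\delta_n \to \infty).
```

The triangle inequality makes the argument insensitive to the sign in front of $\delta_n$, so the limit holds in either case.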

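Consider $H''_{ij}(\delta_n)$. It is bounded by 

\begin{align}
|H''_{ij}(\delta_n)| &\le \sup_{t_i^* > \delta_n} \left|\int_{t_i^*}^\infty E_{q_i,q_j}\left[e^{-\rho t}(q_i + q_j - 2c \mid t > t_i^*)\right]dt\right| \tag{56} \\
&\le \frac{e^{-\rho \delta_n}}{\rho} \sup_{t_i^* > \delta_n} \left|E_{q_i,q_j}(q_i + q_j - 2c \mid t > t_i^*)\right| \tag{57}
\end{align}

Similarly, since $E[q_j \mid t > t_i^*] \to \mu_j$ as $\delta_n \to \infty$, we conclude that $|H''_{ij}(\delta_n)| \to 0$. □ 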